#### Install programs

I installed ipython3-notebook (in Debian Jessie) from the Synaptic package manager. In order to install the R module, I also installed pip for Python 3 from Synaptic. pip is Python's package installer; it downloads modules from the Python Package Index (PyPI). Then I used pip3 to install rpy2:

sudo pip3 install rpy2

There is a blog post on how to avoid using sudo to install pip modules.

Next, install statsmodels, a module for statistical modelling and econometrics in Python. Maybe I should have installed python-statsmodels as a Debian package instead? But it seems to be linked to Python 2.x rather than Python 3 (it had a dependency on python2.7-dev). Therefore I installed statsmodels with pip3, using the --user flag to install it as a user-only module, which does not require sudo:

pip3 install --user statsmodels

The installation took several minutes on my system. It seemed to be installing a number of dependencies. Many warnings about variables that were defined but not used were printed, but the installation kept running. The final message was:

Successfully installed statsmodels numpy scipy pandas patsy python-dateutil pytz

Cleaning up...
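To confirm that the modules ended up where python3 can find them, a small check like the following can help. This is only a sketch: the helper name `installed_versions` is my own (hypothetical), and it assumes each module exposes a `__version__` attribute, which is a common convention rather than a guarantee.

```python
import importlib
import importlib.util


def installed_versions(module_names):
    """Map each module name to a version string, or None if it cannot be found."""
    versions = {}
    for name in module_names:
        # find_spec returns None when the interpreter cannot locate the module.
        if importlib.util.find_spec(name) is None:
            versions[name] = None
            continue
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # e.g. a module that is present but broken
            versions[name] = "import failed: {}".format(exc)
            continue
        # Not every module defines __version__, so fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    names = ["rpy2", "statsmodels", "numpy", "scipy", "pandas", "patsy"]
    for name, version in installed_versions(names).items():
        print(name, version if version is not None else "NOT FOUND")
```

Run it with python3 (not python) so that it inspects the same interpreter that pip3 installed into.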

#### Starting the IPython notebook

Move to the directory where the notebooks will be stored and start the IPython notebook server:

cd python

ipython3 notebook

#### Shortcuts

See also the IPython Notebook shortcuts. Useful shortcuts are ESCAPE to go into command (navigation) mode and ENTER to enter edit mode. It seems one can use the vim navigation keys j and k to move up and down between cells. Pressing the "d" key twice deletes a cell. CTRL+ENTER runs the cell in place, SHIFT+ENTER runs the cell and jumps to the next one, and ALT+ENTER runs the cell and inserts a new cell below.

#### Run R commands in the IPython notebook

Load the IPython extension that handles R commands

%load_ext rpy2.ipython

Display a standard R dataset

%R head(cars)

%R plot(cars)

Use data from the Python statsmodels module, based on this page.

import statsmodels.datasets as sd

data = sd.longley.load_pandas()

Print column names of the dataset

print(data.endog_name)

print(data.exog_name)

Print a dataset as an HTML table by simply giving its name in the cell. For example, this data frame contains the exogenous variables:

data.exog

Python can pass variables to R with the following command:

totemp = data.endog

gnp = data.exog['GNP']

%R -i totemp,gnp

Estimate a linear model with R

%%R

fit <- lm(totemp ~ gnp) # Least-squares regression

print(fit$coefficients) # Display the coefficients of the fit.

plot(gnp, totemp) # Plot the data points.

abline(fit) # And plot the linear regression.

Plot the data points and linear regression with the ggplot2 package

%%R

library(ggplot2)

ggplot(data = NULL, aes(x =gnp, y = totemp)) +

geom_point() +

geom_abline( aes(intercept=coef(fit)[1], slope=coef(fit)[2]))
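As a cross-check on what the R linear model above estimates, here is a minimal sketch of simple least-squares regression in plain Python. The function name `least_squares_line` and the demo numbers are my own (hypothetical, not the Longley data). It uses the closed-form formulas for a single predictor: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept follows from the means.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the ordinary least-squares line y = a + b*x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up illustrative numbers lying exactly on y = 1 + 2*x.
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [3.0, 5.0, 7.0, 9.0]
intercept, slope = least_squares_line(x_demo, y_demo)
print(intercept, slope)  # prints: 1.0 2.0
```

In the notebook, calling `least_squares_line(list(gnp), list(totemp))` should give values close to the two coefficients R prints for the fit, assuming gnp and totemp are the pandas Series defined earlier.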
