Friday, December 18, 2015

Using SSH keys to access remote servers and git repositories

An SSH key can be used to access a virtual private server or a remote git repository without the need to enter a password every time. By sharing your public key with the remote server, your computer is authenticated as a trusted access point.


Creating SSH keys 

In Debian GNU/Linux, using the Gnome desktop, you can create a private and public SSH key pair with, for example, the seahorse key manager, under File / New / Secure Shell Key.

Created keys will be visible under ~/.ssh/. The private key is called id_rsa and the public key id_rsa.pub. You should only share the public key.

At the command line, you can create keys with
ssh-keygen -t rsa -C "your_email@example.com"

Virtual Private Server

I bought a virtual private server with Debian pre-installed. A public key can be added in the file ~/.ssh/authorized_keys. When connected to the server, edit the file:
vim ~/.ssh/authorized_keys
You might need to change access permission to that file as explained in this gist.
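
The permission change can be sketched in Python. This is a minimal sketch, assuming the commonly recommended modes (700 for the .ssh directory, 600 for authorized_keys); the helper name secure_ssh_dir is hypothetical and the demonstration runs in a throwaway directory rather than the real ~/.ssh.

```python
import os
import stat
import tempfile
from pathlib import Path


def secure_ssh_dir(ssh_dir):
    """Create ssh_dir/authorized_keys if needed and restrict both to the owner."""
    ssh_dir = Path(ssh_dir)
    ssh_dir.mkdir(parents=True, exist_ok=True)
    keys_file = ssh_dir / "authorized_keys"
    keys_file.touch(exist_ok=True)
    os.chmod(ssh_dir, 0o700)    # assumption: directory accessible by the owner only
    os.chmod(keys_file, 0o600)  # assumption: file readable and writable by the owner only
    return keys_file


# Demonstration in a temporary directory, not in the real home directory
demo_home = tempfile.mkdtemp()
demo_keys = secure_ssh_dir(Path(demo_home) / ".ssh")
print(demo_keys, oct(stat.S_IMODE(demo_keys.stat().st_mode)))
```

On the server itself, the same function could be called with Path.home() / ".ssh".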

Bitbucket

Your public key can be added to your Bitbucket account under manage account / security / SSH keys. This page explains how to use the SSH protocol with Bitbucket in more detail.

Github

Your public key can be added to your Github account under profile / settings / SSH key. See more details on how to generate and use SSH keys for Github.

Then at the top of your Github repository you should see the "clone URL". Copy the SSH URL, in the form: git@github.com:yourusername/yourrepository.git
Add it as a remote origin:
git remote add origin git@github.com:yourusername/yourrepository.git
If there was already a remote repository you might need to delete it first with git remote remove origin.



Then push and set the remote repository as an upstream repository:
git push --set-upstream origin master
Subsequent pushes can be made simply with
git push

See also

See also my other blog posts on the bash shell commands and on git commands.

Wednesday, November 25, 2015

Ruby, Perl, R, Bash

A comparison of some programming languages. I couldn't add Python because it is not recognised as a programming language by the Google Trends website.

The trend for one keyword is relative to all other searches over the same time period. The decreasing trend of Perl in this graph does not mean that searches for Perl decreased in absolute number. It means that the proportion of these searches to the overall Google searches was decreasing. How Trends data is adjusted.
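
A small worked example makes the point. The counts below are invented for illustration only, and the rescaling to a peak of 100 is my assumption about how such an index is normalised; the function name relative_interest is hypothetical.

```python
# Hypothetical yearly counts, invented for illustration only.
perl_searches = {2005: 1000, 2010: 1200, 2015: 1600}
all_searches = {2005: 100_000, 2010: 200_000, 2015: 400_000}


def relative_interest(keyword_counts, total_counts):
    """Share of a keyword in all searches, rescaled so that the peak equals 100."""
    shares = {year: keyword_counts[year] / total_counts[year]
              for year in keyword_counts}
    peak = max(shares.values())
    return {year: round(100 * share / peak) for year, share in shares.items()}


print(relative_interest(perl_searches, all_searches))
# {2005: 100, 2010: 60, 2015: 40}
```

The absolute number of searches grows from 1000 to 1600, yet the index falls from 100 to 40 because the total grows faster.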

Tuesday, October 20, 2015

Data integration with Knime and the R statistical software

I am testing the Knime software to create data pipelines. I started by installing the following extensions:
  •   KNIME Connectors for Common Databases    
  •   KNIME Interactive R Statistics Integration    

Database operations


I tried chaining the node database Row filter after database selector (containing an SQL statement of the form "select * from table"). But the query was taking ages because my source table is rather large. I replaced the SQL statement in the node database row filter by a statement of the form "select * from table where code = 999". This time the query was much faster.
Unlike dplyr, which updates the SQL query - based on the group_by(), select(), filter() verbs - before executing a final SQL query, it seems that Knime is executing all SQL queries one after the other.
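
The difference between filtering after fetching and filtering inside the query can be sketched with Python's sqlite3 module. This is a toy illustration, not a description of Knime or dplyr internals; the table name trade and its columns are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table standing in for a large source table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trade (code INTEGER, value REAL)")
con.executemany("INSERT INTO trade VALUES (?, ?)",
                [(code, float(code)) for code in range(1000)])


def filter_after_fetch(con, code):
    """Fetch every row, then filter on the client side (slow on big tables)."""
    rows = con.execute("SELECT * FROM trade").fetchall()
    return [row for row in rows if row[0] == code]


def filter_in_query(con, code):
    """Let the database filter, so that only matching rows are transferred."""
    return con.execute("SELECT * FROM trade WHERE code = ?", (code,)).fetchall()


# Both approaches return the same rows, but the second one moves far less data.
print(filter_after_fetch(con, 999) == filter_in_query(con, 999))
print(filter_in_query(con, 999))
```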

Interaction with the R statistical program


Then I pushed the data to R. The input data frame is called knime.in. One issue is that most character vectors are transformed into factors. This was causing various errors: max(year) returned an error, and various merge operations were failing. I had to tell R to change all those column types back to character or numeric.

I wanted to use a filter before using a plot. But I needed to filter on 2 columns. I didn't know how to implement this in Knime. A Google search returned this forum. Rule based row filter seems to work.




In the workflow above, I used R View to display  a plot generated with ggplot.

Workflows are a nice way to display data integration steps and are probably easy to explain to others. Node configuration is rather straightforward, once you have found the right node in the repository. I haven't figured out yet how to use input forms and flow variables.

I don't know how easy it is to maintain functional workflows in the long term.

Monday, September 14, 2015

Programming a test harness

I would like to build a test harness around programs. Automated tests should increase my confidence in the reproducibility of their outcome.
"Whenever you are tempted to type something into a print statement or a debugger expression, write it as a test instead." — Martin Fowler. Quoted here.

Where to store test data

While trying to find out where to place test data, this answer taught me to distinguish between unit tests, which are meant to test each function individually on small mock data, and integration tests, which would be based on a larger, real dataset.
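
The distinction can be sketched in Python (the posts below are about R, so this is only an illustration). The function growth_rates and both check functions are hypothetical, and the generated rows merely stand in for a real dataset.

```python
def growth_rates(values):
    """Relative change between consecutive observations."""
    return [(new - old) / old for old, new in zip(values, values[1:])]


def check_growth_rates_unit():
    # Unit test: one function, tiny hand-made input, expected result known.
    assert growth_rates([100, 150, 300]) == [0.5, 1.0]


def check_growth_rates_integration(rows):
    # Integration-style test: run on a larger dataset and check general
    # properties of the result instead of exact values.
    rates = growth_rates([row["value"] for row in rows])
    assert len(rates) == len(rows) - 1
    assert all(rate > 0 for rate in rates)


check_growth_rates_unit()
# Stand-in for a real dataset that would normally be loaded from disk.
check_growth_rates_integration(
    [{"year": 2000 + i, "value": 10 + i} for i in range(50)])
print("all checks passed")
```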

Testthat

In a commit called "Don't attach dplyr backends", Hadley Wickham removed direct function calls from loaded packages. Probably to ensure that packages are not loaded directly, he changed function calls to a form of packagename::function().

The author of the testthat R package wrote that autotest
"[...] promotes a workflow where the only way you test your code is through tests. Instead of modify-save-source-check you just modify and save, then watch the automated test output for problems."

Debian Continuous Integration

ci.debian.net

"How often are test suites executed?
The test suite for a source package will be executed:

  • when any package in the dependency chain of its binary packages changes;
  • when the package itself changes;
  • when 1 month is passed since the test suite was run for the last time."

Online Continuous Integration


Wednesday, June 17, 2015

Rstudio tips - Key bindings to program and explore data with R


Rstudio is an editor for the R statistical programming language which can be installed on Windows, Mac and Linux. See my post explaining R setup under Debian.

Edit code

  • TAB auto complete object names
  • F1 on a function name shows the help page of that function
  • F2 on a function name jumps to the code where that function was created. I found this key so useful that I decided to create this blog post.
  • CTRL+W  close a tab
  • CTRL+F find and replace text
  • CTRL+SHIFT+F find in all files in a directory (like grep), then click on results lines to jump in the files

Explore data

In the environment window, click on a data frame to view it then click on filter to filter the data frame according to various criteria.

Create pdf or html reports

When editing a markdown .Rmd document, the pdf or html report can be generated with CTRL+SHIFT+K.

Create a package

The R packages book by Hadley Wickham explains how to create R packages. Useful short-cuts when working with packages:
  • CTRL+SHIFT+B build the package
  • CTRL+SHIFT+D generate documentation
  • CTRL+SHIFT+T run devtools::test()
The documentation step can be set to run automatically with the package building under build / configure build tools / generate documentation with Roxygen / configure.

Vim mode

Vim mode can be activated under Tools / Global options / Code. Enter command mode with ":" and ask for ":help".  I use primarily the following keys:
  • jklhw$ggG navigate text
  • iaoA enter edit mode to insert text
  • Escape return to navigation mode
  • v select text
  • ypP copy selected text and paste
  • d delete 
  • /nN search 

Wednesday, May 13, 2015

Enterprise Resource Planning trends

Interest for the search topics: Odoo (formerly open ERP), SAP ERP and Microsoft Dynamics in Germany (link to trends for the whole world).
This Google trend chart shows interest in search topics, not just search queries. This means that trends also include queries that are related to the search topic but do not contain the exact same wording. Explanation from the Google Trends page:
"When you measure interest in a search topic (Tokyo - Capital of Japan) our algorithms count many different search queries that may relate to the same topic (東京, Токио, Tokyyo, Tokkyo, Japan Capital, etc). When you measure interest in a search query (Tokyo - Search term), our systems will count only searches including that string of text ("Tokyo")."

Wednesday, April 29, 2015

How to display dplyr's SQL query

dplyr verbs can be chained to query a database without writing SQL queries. dplyr uses lazy evaluation, meaning that database queries are prepared and only executed when asked by a specific verb such as collect(). I was wondering if it is possible to display the SQL query generated by dplyr?

Indeed dplyr::explain() displays the SQL query generated by dplyr. I have copied a reproducible example below based on the dplyr database vignette.
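
The idea of lazy evaluation can also be sketched outside R. The toy class below is not dplyr and does not reproduce its output; LazyTable, show_query and the flights table are hypothetical names used only to show that verbs record what to do and that the SQL runs when collect() is called.

```python
import sqlite3


class LazyTable:
    """Toy lazy query: verbs only record what to do, collect() runs the SQL."""

    def __init__(self, name, conditions=()):
        self.name = name
        self.conditions = tuple(conditions)

    def filter(self, condition):
        # Nothing is sent to the database here; a new object is returned.
        return LazyTable(self.name, self.conditions + (condition,))

    def show_query(self):
        """Return the SQL that would be executed, without executing it."""
        sql = f"SELECT * FROM {self.name}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql

    def collect(self, con):
        """Execute the prepared query and fetch the rows."""
        return con.execute(self.show_query()).fetchall()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE flights (year INTEGER, delay INTEGER)")
db.executemany("INSERT INTO flights VALUES (?, ?)",
               [(2013, 5), (2013, 40), (2014, 60)])

query = LazyTable("flights").filter("year = 2013").filter("delay > 30")
print(query.show_query())  # SELECT * FROM flights WHERE year = 2013 AND delay > 30
print(query.collect(db))   # [(2013, 40)]
```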
 

Wednesday, April 08, 2015

Ipython notebook and R

I chose to use python 3. Several of the shell commands below have a "3" suffix in Debian testing as of April 2015: ipython3, pip3.

Install programs

I installed ipython-3-notebook (in Debian Jessie) from the synaptic package manager.

In order to install the R module, I installed pip for python 3 in the synaptic package manager. pip is a package installer for Python; it downloads modules from the Python Package Index (PyPI). Then I used pip3 to install rpy2
sudo pip3 install rpy2
There is a blog post on how to avoid using sudo to install pip modules.

Install statsmodels, a module for statistical modelling and econometrics in python. Maybe I should have installed python-statsmodels as a Debian package instead? But it seems to be linked to python 2.x instead of python 3 (it had a dependency on python 2.7-dev). Therefore I installed statsmodels with pip3, using the --user flag mentioned above to install it as a user-only module.
pip3 install --user statsmodels
The installation took several minutes on my system. It seemed to be installing a number of dependencies. Many warnings about variables defined but not used were returned but the installation kept running. The final message was:
Successfully installed statsmodels numpy scipy pandas patsy python-dateutil pytz
Cleaning up...

Starting the Ipython notebook

Move to a directory where the notebooks will be stored, then start an ipython notebook server:
cd python
ipython3 notebook

Shortcuts

See also the Ipython Notebook shortcuts. Useful shortcuts are ESCAPE to go into navigation mode and ENTER to enter edit mode. It seems one can use vim navigation keys j and k to move up and down cells. Pressing the "d" key twice deletes a cell. CTRL+ENTER runs the cell in place, SHIFT+ENTER runs the cell and jumps to the next one, and ALT+ENTER runs the cell and inserts a new cell below.

Run R commands in the Ipython notebook


Load an ipython extension that deals with R commands
%load_ext rpy2.ipython
 Display a standard R dataset
%R head(cars)
%R plot(cars)
Use data from the python statsmodels module based on this page.
import statsmodels.datasets as sd
data = sd.longley.load_pandas()
Print column names of the dataset
print(data.endog_name)
print(data.exog_name)
Print a dataset as an html table by simply giving its name in the cell. For example this data frame contains exogenous variables:
data.exog
Python can pass variables to R with the following command:
totemp = data.endog
gnp = data.exog['GNP']
%R -i totemp,gnp
Estimate a linear model with R
%%R
fit <- lm(totemp ~ gnp)  # Least-squares regression
print(fit$coefficients)  # Display the coefficients of the fit.
plot(gnp, totemp)  # Plot the data points.
abline(fit)  # And plot the linear regression.
Plot the datapoints and linear regression with the ggplot2 package
%%R
library(ggplot2)
ggplot(data = NULL, aes(x =gnp, y = totemp)) +
    geom_point() +
    geom_abline( aes(intercept=coef(fit)[1], slope=coef(fit)[2]))

Wednesday, April 01, 2015

Virtual Machine setup for development purposes


Creating a virtual machine with Vagrant and PuPHPet.


According to those 2013 stack overflow questions, there were many reasons not to develop in a VM, unless one had to specifically develop for several OS.
But in the same year, the PuPHPet developer explained why he thinks that one has to develop in a virtual machine.

Running a VM 

I followed the vagrant instructions to install a basic VM.
vagrant init hashicorp/precise32
vagrant up
The second command returned the following message:
"The guest machine entered an invalid state while waiting for it to boot." [...] "If the provider you're using has a GUI that comes with it, it is often helpful to open that and watch the machine"
I started the virtual machine in virtual box, an error message came up: 
"VT-x is disabled in the BIOS. (VERR_VMX_MSR_VMXON_DISABLED)."
Under Machine / Settings / System / Acceleration, I disabled the hardware virtualisation. The VM could then start. This works for 32 bit systems. Unfortunately 64 bit systems require hardware virtualisation, which means I cannot change this setting for 64 bit systems. I'll have to enable VT-x in the BIOS later on.

After I installed Virtual box, my mouse was rendered invisible. This may be due to the fact that the mouse was captured and that I didn't know the host capture key (default to the right Ctrl key) to free the mouse from the virtual machine's window.

Connecting to the virtual machine

Connecting from the virtual box GUI. The default user is "vagrant" and password "vagrant".

Connecting with SSH into the machine from a command prompt:
 vagrant ssh

 

Shared folder

A folder can be shared with the host operating system. In the Virtual Box settings for the machine, under shared folder, create a machine folder and set it to auto-mount in the guest operating system.

Other tools



Messages by the vagrant creator

Tao of hashicorp
Comparing Filesystem Performance in Virtual Machines
Automation Obsessed

Tuesday, March 31, 2015

Gauss commands

Comments begin "/*" end "*/" or begin "@" end "@"

    /* Comments */
    @ Comments @


Change working directory:

    chdir
 

Load data 
The filename can be either a literal or a string. If the filename is in a string variable, then the ^ (caret) operator must precede the name of the string, as in:

    filestr = "data/filename.txt";
    loadm x = ^filestr;
 

Run a script 

    run file_name;
     

Indexing matrices
See help aptech.com.gauss.13.0/doc/LF.6-DataTypes.html
The statement

    y = x[1:3,5:8];
 

will put the intersection of the first three rows and the fifth through eighth columns of x into the matrix y.
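
For comparison, here is the same selection sketched in Python with a plain list of rows. GAUSS indices start at 1 and ranges include both ends, whereas Python slices start at 0 and exclude the end. The helper submatrix and the example matrix are hypothetical.

```python
# Hypothetical 4 x 10 matrix stored as a list of rows: the entry in row r and
# column c (both counted from 1, as in GAUSS) holds the value 10 * r + c.
x = [[10 * r + c for c in range(1, 11)] for r in range(1, 5)]


def submatrix(m, rows, cols):
    """Select rows and columns given as inclusive, 1-based (first, last) pairs."""
    (r1, r2), (c1, c2) = rows, cols
    return [row[c1 - 1:c2] for row in m[r1 - 1:r2]]


# Equivalent of the GAUSS statement y = x[1:3,5:8];
y = submatrix(x, (1, 3), (5, 8))
print(y)
# [[15, 16, 17, 18], [25, 26, 27, 28], [35, 36, 37, 38]]
```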

Plot

plotXY(datax[.,1], datax[.,2:cols(datax)]) 
plotXY(datay[.,1], datay[.,2:cols(datay)])

Gauss resources

Basic GAUSS workshop 2002
Aptech Tutorial, running a program file

Wednesday, March 25, 2015

Octave commands

I am trying to run Matlab based test statistics in GNU Octave.

Octave commands 

List variables available in memory
who %this is a comment
whos %provides class details
Change and display working directory
cd directory_name
pwd
Manipulate data structures:
x.a = 1;
x.b = [1, 2; 3, 4];
x.c = "string";
Display the value of a variable
disp(x)
Loop over a list of files
csvfiles = dir("*.csv")
for file= csvfiles'
fprintf(1,'Doing something with %s\n',file.name)
end
Creating character arrays
"In the MATLAB® computing environment, all variables are arrays, and strings are of type char (character arrays)."

Reading data from an Excel or CSV file

The test statistic I wanted to use loads data from an Excel file, but this returned the error: " 'xlsread' undefined ". Reading Excel files is provided by the IO package, which is not installed by default. The package is available in the Debian repository under "octave-io", with the description "This package [...] contains functions to [...] read Excel spreadsheet (xlsread) and OpenDocument spreadsheet (odsread)." It is based on Apache POI. Load the package and try to read a file:
pkg load io;
data=xlsread('file_name.xls');
xlsread returns an error "Detected XLS interfaces: None."  This forum post recommends to load the java and windows packages as well. Those packages are not available in the Debian repositories.
I decided to convert the Excel file to csv and use csvread instead.

The script now gives the same output as on a windows machine running Matlab.

Warning: possible Matlab-style short-circuit operator 

Short-circuit boolean operators explains that:
"MATLAB has special behavior that allows the operators ‘&’ and ‘|’ to short-circuit when used in the truth expression for if and while statements. The Octave parser may be instructed to behave in the same manner, but its use is strongly discouraged." [...]
I wonder why it is strongly discouraged. 
"To obtain short-circuit behavior for logical expressions in new programs, you should always use the ‘&&’ and ‘||’ operators."
I replaced "|" by "||" in the code.
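
The difference between the two kinds of operators can be illustrated with an analogy in Python, where "or" short-circuits and "|" evaluates both operands. This is not Octave code, and the helper check is hypothetical; it only records which operands were evaluated.

```python
calls = []


def check(name, result):
    """Record that the operand was evaluated, then return its truth value."""
    calls.append(name)
    return result


calls.clear()
check("left", True) or check("right", False)  # short-circuit: right is skipped
short_circuit_calls = list(calls)

calls.clear()
check("left", True) | check("right", False)   # both operands are evaluated
eager_calls = list(calls)

print(short_circuit_calls)  # ['left']
print(eager_calls)          # ['left', 'right']
```

Skipping the right operand matters when it is expensive to compute or fails when the left operand is already true.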

Writing test results to a file

Matlab low level file IO explains how to use fprintf (a vectorised implementation of the C function) to write text data to a file.

Thursday, March 12, 2015

Stata commands

Load csv data

cd /home/paul/ 
insheet using filename.csv

tsset and xtset for panel variables

The 2 commands are basically similar (STATA forum discussion). tsset mentions "If you tsset panelvar timevar, you do not need to xtset panelvar timevar to use the xt commands."
xtset country year

View available test results

How to access stored estimation results
 
Stata help: "to see what was returned from an estimation command", type:
ereturn list

Then display results with:
 display e(depvar)
matrix list e(b)

View the source code of a command

viewsource xtset.ado

Thursday, March 05, 2015

Why should research organisations release free software?

Research organisations protect software with Intellectual Property (IP) rights. Some of these IP rights authorise the release of source code but some prevent source code release. Within the organisation, a decision maker should ask herself:
  • Can the organisation pay a person or a group of people to maintain that program in the long run?
If the answer is no, read on.

Researchers frequently move to other job positions. Once a researcher has moved to another job, the software code he/she wrote is likely to sit idle on the organisation's storage drives. When no insider knows how to modify a computer program's code, the value of that program for the organisation will depend on the possibility for outsiders to modify the code.
  1. If researchers outside the organisation are not allowed to update the software, it will not be used. IP rights preventing source code modification don't have any value.
  2. If, on the other hand, the piece of software is released as free and open source software, researchers outside the organisation are likely to update the software once the need arises. IP rights ensure that the first creator's contribution, with the organisation's affiliation, remains cast in the software's stone. An acknowledgement mentioning the organisation will travel with the piece of software as long as this piece of code is useful. This is likely to attract future project contribution and funding to the host organisation.

Monday, March 02, 2015

Panel cross section dependence tests in STATA and R


STATA example 

Using the Grunfeld investment data:

        use "http://fmwww.bc.edu/ec-p/data/Greene2000/TBL15-1.dta"
        xtset firm year
        xtreg i f c,fe
        xtcsd, pesaran


Output of the xtcsd command only:
Pesaran's test of cross sectional independence =     1.098, Pr = 0.2722

R example

Using the same data: 
library(foreign) # To import STATA .dta files
library(plm) # Provides pcdtest()
grunfeld <- read.dta("http://fmwww.bc.edu/ec-p/data/Greene2000/TBL15-1.dta")

pcdtest(i ~ f + c, data=grunfeld, model = "within", effect = "individual", index = c("firm","year"))
Output of the pcdtest command:
    Pesaran CD test for cross-sectional dependence in panels

data:  formula
z = 1.0979, p-value = 0.2722
alternative hypothesis: cross-sectional dependence

Thursday, February 26, 2015

Installing STATA on Debian GNU-LINUX


I needed to install STATA to collaborate with a colleague at work. The computer guy gave me the software on a disk, with an installation guide. Here are the commands I entered following those instructions:

Create a directory for Stata
# mkdir /usr/local/stata13
# ln -s /usr/local/stata13/ /usr/local/stata
Install Stata
# cd /usr/local/stata13
# /media/paul/Stata/install
Stata 13 installation
---------------------

  1.  uncompressing files
  2.  extracting files
  3.  setting permissions

Done.  The next step is to run the license installer.  Type:

        ./stinit
If the licensed software is Stata/IC 13, you will be able to run Stata/IC by typing
        xstata              (Run windowed version of Stata/IC)
        stata               (Run console  version of Stata/IC)

Run the license installer
./stinit
There follow some questions about user name and affiliation. "The two lines, jointly, should not be longer than 67 characters."
Then comes the message:
Stata is initialized.
You should now, as superuser, verify that you can enter Stata by typing

        # ./stata
or
    # ./xstata

I added this to my .bashrc so that stata and xstata can be used as a command directly:
 export PATH=$PATH:/usr/local/stata

Both commands "stata" and "xstata" work as a normal user now.

There is an error message when running xstata:
'Failed to load module "canberra-gtk-module"'
But this was not a problem at the start.

GNOME application launcher


I added STATA to the GNOME application launcher, by typing "application" in the launcher, then "main menu", "new menu".

R to Stata

I use R most of the time for data analysis and will export csv files to STATA.
R command to export csv files:
write.csv(dtf, "filename.csv", row.names = FALSE, na = ".")
STATA command to import csv files:
insheet using "filename.csv", delimiter(",")


Wednesday, February 11, 2015

Big scientist

Hilary Mason:
Big data is data that cannot hold on one node.
[...] Some people spread the idea that big data will tell you what to do. [...] This is bullshit, it concerns me that this is starting to get steam outside of the tech community.
Neha Kothari
The LinkedIn Hadoop cluster contains information on all clicks made by users. 1000 employees have access to the cluster and run queries on the data with Pig.
 Women in data science

Tuesday, February 10, 2015

LibreOffice crash at startup

Since I opened a bizarre Excel document, LibreOffice crashed at startup. Deleting the user profile solves the issue:

rm -r ~/.config/libreoffice/4/user/

Monday, February 09, 2015

Bulk resize images with a shell script and imagemagick

I like to resize images to 1000 pixels wide before sending them by email. It's large enough for most screens and I can send a dozen pictures without cluttering my friends' inboxes. A pity that in 2015 there is still no easy way to do this integrated in the Gnome file manager. Maybe there is another way, or maybe I should use a proper image management program. Add a comment on how you resize images.
 
But it's easy to create a shell script:
#!/bin/sh
# Quote "$i" so that file names containing spaces are handled correctly
mkdir -p small
for i in *.jpg; do
    echo "$i"
    convert "$i" -resize 1000x "small/$(basename "$i" .jpg).jpg"
done
Since I had images ending with the .JPG extension in capital letters, I added a second loop:
for i in *.JPG; do
    echo "$i"
    convert "$i" -resize 1000x "small/$(basename "$i" .JPG).jpg"
done
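
The two loops could be merged by comparing the extension case-insensitively. Below is a minimal sketch in Python, assuming ImageMagick's convert is installed; the function names resize_command and resize_all are hypothetical. The first function only builds the command, so it can be inspected without running anything.

```python
import subprocess
from pathlib import Path


def resize_command(image, out_dir="small", width=1000):
    """Build the ImageMagick call for one image (the command is not run here)."""
    image = Path(image)
    target = Path(out_dir) / (image.stem + ".jpg")
    return ["convert", str(image), "-resize", f"{width}x", str(target)]


def resize_all(folder="."):
    """Resize every .jpg or .JPG file in folder; requires ImageMagick's convert."""
    folder = Path(folder)
    (folder / "small").mkdir(exist_ok=True)
    for image in sorted(folder.iterdir()):
        if image.suffix.lower() == ".jpg":
            subprocess.run(resize_command(image, folder / "small"), check=True)


print(resize_command("IMG_1.JPG"))
```

Calling resize_all() in a directory of pictures would then write the resized copies to the small/ subdirectory.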

Inspired by this question on stack overflow, and the mkdir complaint corrected by looking at this question.

Thursday, February 05, 2015

Debian GNU-LINUX communications

Ekiga

Ekiga seems to be the tool to use for audio and video calls on the Gnome desktop. Ekiga is based on a communication framework called Telepathy which enables connection by multiple clients simultaneously.  What communications network should I use?

Skype

I use Skype to communicate with Windows users and also with some other Ubuntu GNU/Linux users. Would it be possible to use other tools to communicate with them?

Recordmydesktop

Recordmydesktop can be started from the command line, a flag is required to use the pulse sound server:
recordmydesktop --device pulse

Stop recording with CTRL+C.
To use the GUI, the advanced / Sound / Device setting should also be changed to "pulse". To stop a recording started from the GUI, show the Gnome bottom bar, using the windows + M key, then press on the icon that appears there.

As explained in this Fedora forum, Gnome also has its own desktop recording system which can be started and stopped with CTRL + SHIFT + ALT + R.

 Webex

Some colleagues would like to use Cisco Webex. This requires the Java plugin for Firefox. Based on this blog, I installed it with:
$> apt-get install icedtea-7-plugin
This will install a web browser plugin based on OpenJDK 7 and IcedTea.

But there is no sound. Because Webex uses a proprietary 32 bit sound application, the only fix seems to be to install a 32 bit version of Firefox / Iceweasel.

Blog: Webex support on 64 bit Fedora Linux system explains that a 32 bit version of Firefox has to be used:
"One means of successfully accessing WebEx from Fedora 12 x86_64 is to use a 32-bit version of Mozilla Firefox with Sun JRE and Adobe Flash 32-bit plugins.
Why 32-bit?
  1. Per the JRE download site, the 64-bit version does not have support for java applets or Java Web Start (JWS is required to run another WebEx like app named Elluminate).
  2. Per the WebEx System Requirements page only, 32-bit versions of Linux are supported."
Blog: Install 32 bit Firefox and Thunderbird on Debian 64 bit.
Blog: 32 bit applications on 64 bits Linux Mint

This user had an issue with i386 packages and explains how to remove them from his system. How do I remove all i386 architecture packages from my Debian installation?

The Debian multi architecture page explains how to install 32 bit programs (called the i386 architecture) on a 64 bit machine.

Tuesday, January 27, 2015

Patterns

Sometimes it helps to think in terms of design patterns.

10 years ago, a friend of mine offered me a book on architectural patterns by Christopher Alexander (A Pattern Language: Towns, Buildings, Construction). I remember beautifully simple descriptions of architectural patterns in buildings, such as a place by the window (inside a house), a "high place" to look around town, or avoiding X junctions and keeping only T junctions in residential areas.

How about statistical patterns?

Monday, January 26, 2015

Including R data and plots in a Latex document with knitr


knitr default typesetting:
"The chunk option out.width is set to '\\maxwidth' by default if the output format is LaTeX."

Monday, January 19, 2015

Read part of an Excel sheet into an R data.frame

Documentation of the read.xlsx function: 
read.xlsx(file, sheetIndex, sheetName=NULL, rowIndex=NULL,  startRow=NULL, endRow=NULL, colIndex=NULL,  as.data.frame=TRUE, header=TRUE, colClasses=NA,  keepFormulas=FALSE, encoding="unknown", ...)
Returns a data.frame.
Example of use:
dtf <- read.xlsx(file = filename,
    sheetName = sheetname,
    rowIndex = 2:10,
    colIndex = 5:20)

Friday, January 16, 2015

Building an R package

A package can be seen as a coherent group of functions available for future projects. Building your own package enables you to reuse and share your statistical procedures. Function parameters and examples can be documented with Roxygen to facilitate digging back into the code later on. I created my second package  based on instructions from Hadley. My package structure is composed of the following folders:
  • R/   contains R code
  • tests/  contains tests
  • inst/  contains files that will be exported with the package
  • docs/  contains .Rmd documents illustrating code development steps and data analysis.
  • data/    contains data sets exported with the package
  • data-raw/    contains raw dataset and R code to extract raw data from disk and from web resources.

Code

Create a directory containing a package skeleton
devtools::create("packagename")
RStudio has a menu build / configure build tools where devtools package functions and document generation can be linked to keyboard shortcuts:
  • document CTRL + SHIFT + D 
  • build and reload CTRL + SHIFT + B
devtools::load_all(), or Cmd + Shift + L, reloads all code in the package.
Add packages to the list of required packages:
devtools::use_package("dplyr")
devtools::use_package("ggplot2", "suggests")

Data

For data I followed his recommendations in r-pkgs/data.rmd:
devtools::use_data(mtcars)
devtools::use_data_raw() # Creates a data-raw/ folder and adds it to .Rbuildignore

Tests

Example of testing for the devtools package

Bash command to build and check a package

Bash command to build a package directory:
R CMD build packagename
Bash command to check a package tarball:

R CMD check packagename_version.tar.gz
An error log (good luck understanding it) is visible at:
packagename.Rcheck/00check.log
Generate the documentation and check for documentation specific errors
R CMD Rd2pdf packagename --no-clean
The --no-clean option keeps, among other files, a temporary LaTeX file which can be inspected under:
packagename.Rcheck/packagename-manual.tex

Alternatively the build and check procedure can be run in RStudio as explained above.

Monday, January 12, 2015

Debian GNU-Linux installation on a HP EliteBook

A nice laptop provided by my employer, although the name "EliteBook" sounds like marketing rubbish. My colleague installed Debian 8 on the system, as a dual boot with windows 8.
See also my earlier posts tagged Debian.

Multi boot

The system is a dual boot with Windows. Debian was installed with a traditional - non UEFI - boot loader. Therefore I have to press F9 each time I want to start Debian. This remains a minor annoyance since I usually restart only every 2 weeks or so (often when the battery has fully drained after a trip). On ordinary days, I put the system in sleep mode in the evening and it awakens in Debian again in the morning.

HP boot instruction didn't work to change the boot order.

 Language

The Debian language was set to French, so I changed it back to English.
Reconfigure locales. As root, run:
dpkg-reconfigure locales

User privileges

As root, install sudo:
apt-get install sudo
Add a user to the super user group
adduser username sudo
Log out and log back in for this change to take effect. Now the user should be in the sudo group.

Wireless card

To use the non-free Intel wireless driver iwlwifi, in the Synaptic package manager under settings / repositories, add "main contrib non-free" to the sections of Jessie packages. Then install firmware-iwlwifi.
In a terminal, restart the modules
 modprobe -r iwlwifi ; modprobe iwlwifi 

Wireless network Eduroam

Security: WPA & WPA2 Enterprise
Authentication: Tunneled TLS
CA certificate: /etc/ssl/certs/addtrust_external_root.pem
Inner authentication: PAP
Username: user_name@institution.fr
Password: ******

Email and calendar with Evolution

I configured email. I use 2 step authentication and had to generate an application password to load events from my private Google calendar. To add a public Google calendar, I added an "on the web calendar" and inserted the ical link for that public calendar.

Under Edit / Preferences / Contacts / Automatic contacts / I selected to create an automatic contact when sending an email.

R and R studio

Since a previous post on R in Debian, I have changed Debian version from Wheezy to Jessie. Luckily, Jessie contains a recent version of R, so I don't need to add the Cran repositories anymore.

Installed Rstudio.

Installed the following packages (at an R prompt or in RStudio):
install.packages(c("plyr", "reshape2", "ggplot2"))
install.packages(c("xtable", "markdown", "devtools"))
Installing RMySQL requires the MySQL development library. At the shell:
sudo apt-get install libmysqlclient-dev
In R :
install.packages("RMySQL")
Installing R package xlsx requires the java development kit. In the system shell:
sudo apt-get install openjdk-7-jdk
sudo R CMD javareconf
Then in R:
install.packages("xlsx")

Firefox rebranded as iceweasel  + Adblock plus

Installed adblock plus.

Installed icedove to access outlook email backup

See my previous post on accessing outlook email archive from Debian.





Tuesday, January 06, 2015

Debian dist-upgrade from Wheezy to Jessie

I needed a couple of recent software versions (Lyx version> 2.1 and Latex), and they were not available in the wheezy-backports, therefore I decided to upgrade from Debian Wheezy, the current stable version to Debian Jessie, the testing version. Debian Install FAQ.

This page gives the configuration changes and commands to be used:

  1. Edit /etc/apt/sources.list and replace all occurrences of "wheezy" with "jessie".
  2. run apt-get update
  3. run apt-get upgrade
  4. run apt-get dist-upgrade
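
Step 1 can be done in a text editor. As an illustration, here is a minimal sketch of the replacement in Python; the function switch_release is hypothetical and the two repository lines are only example content, not my actual sources.list.

```python
import re


def switch_release(sources_text, old="wheezy", new="jessie"):
    """Replace the release name wherever it appears as a whole word."""
    return re.sub(rf"\b{re.escape(old)}\b", new, sources_text)


# Example content only; a real /etc/apt/sources.list will differ.
example = (
    "deb http://ftp.debian.org/debian/ wheezy main\n"
    "deb http://security.debian.org/ wheezy/updates main\n"
)
print(switch_release(example))
```

The word boundary also matches suite names such as wheezy/updates or wheezy-updates, which then become jessie/updates and jessie-updates.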
For several reasons the upgrade process didn't complete properly. After a system reboot, there was no networking, no graphical desktop and even after starting the Gnome desktop by hand with "startx", there were several issues with system settings, such as printer setup.

Networking

I had an issue with networking not working.
According to the debian page on network setup, the changes below are not recommended.

I fixed it by editing /etc/network/interfaces:
# Lines added from odoepner.wordpress.com
auto eth0 
allow-hotplug eth0
iface eth0 inet dhcp

Then ran as root
/etc/init.d/networking start
And network was working again.

This page explains how to put this network as a managed interface again: 

But I shouldn't have done this.
According to the debian page on network setup: "Keep configuration of "/etc/network/interfaces" as simple as in the following".
auto lo
iface lo inet loopback

No desktop manager at startup

I realised that there was no desktop manager at startup. I first followed answers to this question and edited /etc/inittab. But this is not needed, because the init system systemd doesn't read that configuration file.
I spent some time reading about a systemd controversy in Debian and why it doesn't matter so much in the end.

For the moment I start the Gnome desktop from a terminal with the command:
startx

It looks like not all packages had been upgraded.
I ran as root
apt-get upgrade -f
The following packages will be upgraded:
  live-tools
[...]
Preparing to unpack .../live-tools_4.0.2-1_all.deb ...
dpkg-divert: error: rename involves overwriting `/usr/bin/uptime' with
  different file `/usr/bin/uptime.orig.procps', not allowed
From Synaptic, I did a complete removal of the package live-tools.
After that many configuration steps took place. Printer setup utility was working again. And now there only remains a bluetooth issue.

Bluetooth issue

dpkg: error processing package gnome-bluetooth (--configure):
 dependency problems - leaving unconfigured
Errors were encountered while processing:
 bluez
 bluetooth
 gnome-bluetooth

Lyx upgraded to version 2.1.2

At least Lyx was upgraded to version 2.1.2. I had to "reconfigure LyX with Tools→Reconfigure;". Then I could create documents with Lyx again. And the R integration worked as well.

Upgrade might not have been needed

Afterwards, I realised that an upgrade might not have been needed:
"If you want to install a single package in Debian, you do not need to update the whole system. It can be done with three commands by inserting the repos for Testing and Unstable in /etc/apt/sources.list, fixing the distro you (mostly) wish by setting APT::Default-Release "stable" in /etc/apt/apt.conf.d/local, then doing aptitude install package_name/testing -t testing"