Tuesday, November 18, 2014

Gandi CNAME records to map google sites


Map a Google site to a custom domain.
  1. Click the More Actions button and select Manage Site.
  2. Click the Web Address tab.
  3. Enter your custom URL in the Web Address text box, then click the Add button at the top of the page.
Gandi CNAME records to map a Google site.
Enter your CNAME record using the examples listed in the CNAME record values table and the guidelines below:
Name: Enter your custom URL prefix (such as mail or www).
Type: Select CNAME.
Value: Enter ghs.googlehosted.com., making sure to include a trailing dot (.) at the end of the value.

Be aware it may take as long as 72 hours before DNS changes are propagated, depending on the time to live (TTL) that was configured for your records. Until records have been updated worldwide, you will still receive traffic to your old server.


You can check DNS propagation with this tool:
In my case it took a few hours before the new CNAME had propagated worldwide. The first DNS servers to be updated were in California; it's as if they use a shorter time to live for CNAME records. Ten hours later, only one DNS server in Malaysia wasn't up to date.
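
To follow propagation from the command line rather than from a web tool, a small sketch along these lines could help. It assumes dig is installed; check_cname is a hypothetical helper name, and the two resolver addresses (8.8.8.8 and 1.1.1.1) are only examples of public resolvers.

```shell
# Hypothetical helper: ask a few public resolvers for the CNAME of a host
# and report whether each one already returns the expected target.
check_cname() {
    host="$1"
    expected="$2"
    # Example resolvers (assumption): replace with the ones you care about.
    for resolver in 8.8.8.8 1.1.1.1; do
        answer=$(dig @"$resolver" "$host" CNAME +short)
        if [ "$answer" = "$expected" ]; then
            echo "$resolver: updated ($answer)"
        else
            echo "$resolver: not yet (${answer:-no CNAME})"
        fi
    done
}
# Example (placeholder domain), note the trailing dot in the target:
#   check_cname www.example.com ghs.googlehosted.com.
```

Each resolver caches the old record until its TTL expires, so the answers can differ for a while.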

Data manipulation with dplyr

Dplyr is a package for data manipulation developed by Hadley Wickham and Romain Francois for the R statistical software.

  • Introduction to dplyr
  • A Tutorial from João Neto (dplyr.Rmd) gives examples of tools for grouped operations: 
    • n(): number of observations in the current group
    • n_distinct(x): count the number of unique values in x.
    • first(x), last(x) and nth(x, n) - these work similarly to x[1], x[length(x)], and x[n] but give you more control of the result if the value isn’t present.
    • min(), max(), mean(), sum(), sd(), median(), and IQR()

Non standard evaluation

dplyr uses non-standard evaluation. To use standard evaluation, a workaround has to be found. See Stackoverflow question.

Thursday, November 13, 2014

GNU screen for long running server processes

Use screen to keep a long process running on a server after you close the ssh session. I started a screen session with:
    screen -S sessionname
To find the screen session later, give it a name: either rename a running session with the sessionname command, or use the -S flag on the first screen invocation, as above.
I started R in this screen session and launched a long-running process, then detached the session with:
    CTRL-A D
I could re-attach the session later with:
    screen -r sessionname
If the session was not detached properly, it might be necessary to detach it and re-attach it in one go:
    screen -d -r sessionname
Detach screen before leaving the ssh session. You may also want to use the autodetach configuration option.

Log out of your ssh session, then log back in later:
screen -ls # list sessions
screen -r sessionname # attach a session
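
When the job does not need any interaction at start-up, screen can also create the session already detached, so there is nothing to detach by hand before logging out. A minimal sketch, assuming screen is installed; start_in_screen and the script name long_job.R are made up for illustration.

```shell
# Start a command in a new, already detached, named screen session
# (-d -m: create the session detached, -S: give it a name),
# then list the sessions to confirm it exists.
start_in_screen() {
    name="$1"
    shift
    screen -dmS "$name" "$@"
    screen -ls
}
# Example (hypothetical script name):
#   start_in_screen analysis Rscript long_job.R
#   screen -r analysis    # attach later to look at the output
```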

There might be ways to move an already started process into GNU screen, but it's not straightforward:
"You cannot do this, easily. I'd suggest making it a habit to start screen as the first thing you do after opening a console. However, for your actual problem, there's another thing you could try: after having launched your job from the terminal, background it by typing ctrl-z and then bg. After that, detach the job from it's parent shell; in bash you'd do disown -h %. After that, you can safely close the terminal and the job will continue running."

More tips in this discussion on screen.

Thursday, November 06, 2014

SSH tunnel and port forwarding


  Ubuntu Remote_Port_Forwarding

        ssh -C -D 1080 laptop

    http://straightedgelinux.com/blog/howto/socks.html
        ssh -N -D 1080 klaatu@home.linuxserver.com
    https://wiki.debian.org/SOCKS


http://www.debian-administration.org/article/449/SSH_dynamic_port_forwarding_with_SOCKS

        ssh -D 1080 shell.example.org
        tsocks thunderbird
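
The -D option makes ssh listen on a local port (1080 here) and act as a SOCKS proxy. One way to check that the tunnel works is to send a request through it with curl. This is only a sketch: via_socks and the SOCKS_PORT variable are names I made up, and the host names are placeholders.

```shell
# Send a request through the SOCKS proxy opened by "ssh -D".
# --socks5-hostname also resolves host names on the remote side of the tunnel.
via_socks() {
    curl --socks5-hostname "localhost:${SOCKS_PORT:-1080}" "$@"
}
# Example: open the tunnel in the background (-f), without running a
# remote command (-N), then fetch a page through it.
#   ssh -f -N -D 1080 shell.example.org
#   via_socks http://example.org/
```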

Thursday, October 30, 2014

R, packages and Rstudio install on Debian wheezy


See also my previous post on Debian GNU-Linux installation on a Lenovo T400.

R install

I used the Synaptic package manager to add the R repository for Debian from a nearby mirror, under : settings / repositories / other software / add.
Add this APT line:
deb http://cran.univ-paris1.fr/bin/linux/debian/ wheezy-cran3/

There was an error:
W: GPG error: http://cran.univ-paris1.fr wheezy-cran3/ Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 06F90DE5381BA480
After looking at several forums, and this stackoverflow question, I installed debian-keyring and added the key with the commands:
gpg --keyserver pgpkeys.mit.edu --recv-key 06F90DE5381BA480
gpg -a --export 06F90DE5381BA480 |sudo apt-key add -
I could then install R version 3 from the synaptic package manager.

Rstudio

I downloaded RStudio. There was a missing dependency on libjpeg62, so I installed that package from Synaptic, then ran the dpkg command to install RStudio:
dpkg -i rstudio-0.98.507-i386.deb

Tools

Then I installed Git in order to clone my R project from an online repository.
git clone  project_repository_url

Packages

Within Rstudio, I installed a few packages:
install.packages(c("plyr", "reshape2", "ggplot2"))
install.packages(c("xtable", "markdown", "devtools"))

devtools

The devtools package requires the libcurl development package from Debian. You can install it at the shell prompt:
$ sudo apt-get install libcurl4-gnutls-dev
Back at the R prompt:
install.packages("devtools")
Other dependencies might be needed; the RStudio page on devtools recommends installing the Debian package r-base-dev.

dplyr

The dplyr package required the latest version of the Rcpp package, which was not available on my CRAN mirror. I installed it from source (based on this message):
install.packages("Rcpp", type = "source")
install.packages("dplyr")

xlsx

The xlsx package installation complained:
configure: error: Cannot compile a simple JNI program. See config.log for details.
Make sure you have Java Development Kit installed and correctly registered in R.
If in doubt, re-run "R CMD javareconf" as root.


The xlsx package required the latest version of Java 7 (fix inspired by this post). I installed openjdk-7 from the Synaptic package manager, then ran:

update-alternatives --config java  # Choose java 7 as the default
R CMD javareconf
Then:
install.packages("xlsx") # worked

RMySQL

MySQL client and server are installed on my system.
While installing RMySQL, I struggled with a configuration error:
  could not find the MySQL installation include and/or library
  directories.  Manually specify the location of the MySQL
  libraries and the header files and re-run R CMD INSTALL.
This post has an answer (thanks!):
sudo apt-get install libdbd-mysql  libmysqlclient-dev
That fixes the issue!
I can now connect to the database:
library(RMySQL)
mychannel <- dbConnect(MySQL(), user="paul", host="localhost",
                       password="***", dbname="dbname")

R packages which are better installed from the Debian package manager

Some packages, such as ‘minqa’, ‘SparseM’ and ‘car’, return an error when one tries to install them from the R prompt. I could only install them from the Debian package manager, where their names start with "r-cran": "r-cran-car", "r-cran-sparsem", "r-cran-minqa".

Ready to work!


Thursday, October 09, 2014

Email backup from Microsoft Outlook to Thunderbird / Icedove under Debian

To read an Outlook email archive (.pst file) on a Debian system:
  1. On the Windows system, I imported the archive into Thunderbird.
  2. In Debian, I accessed it from Icedove (the rebranded Thunderbird).

1. In Windows 

There was an error in the 3.9 GB pst file created in September 2014.

scanpst.exe

As explained on Microsoft support, I tried the inbox repair tool, scanpst.exe. It found errors in the file and I asked it to repair them. It copied the whole content to a backup file (.bak), created a log file and corrected the archive. I could then open the archive in Outlook, under File / Data File Management. I compacted the archive, which is now 3.7 GB in size. I closed Outlook and put the archive in my online backup folder.

Thunderbird

To be able to access this mail archive under Linux, I followed a Mozilla article which recommends importing Outlook mail (including the .pst file) into Thunderbird with the menu:
Tools  / Import /   mail
Thunderbird can only open .pst archive files on a Windows machine which has Outlook installed. Then I closed Thunderbird and copied the file found under.

2. In Debian 

Icedove

I installed Icedove from the Synaptic package manager, started it and closed it again. This created a directory in my home folder, .icedove, which is where I'll paste the Thunderbird backup.
Mozilla's explanation of how to restore a Thunderbird backup:
  1. Locate the backed up profile folder on your hard drive or backup medium (e.g., your USB-stick).
  2. Open the profile folder backup (e.g., the xxxxxxxx.default backup).
  3. Copy the entire contents of the profile folder backup, such as the mimeTypes.rdf file, prefs.js file, bookmarkbackups folder, etc.
  4. Locate and open the new profile folder as explained above and then close icedove (if open).
  5. Paste the contents of the backed up profile folder into the new profile folder, overwriting existing files of the same name.
  6. Start Icedove.

Online data storage with mono /usr/lib/hubic/hubiC.exe main-loop

List of online content delivery platforms in French

I am testing Hubic, a French online storage service. I installed it on GNU-Linux, where it actually runs a Windows-style executable through mono. The running process is the following:
mono /usr/lib/hubic/hubiC.exe main-loop

To show or set the current synchronization directory:
hubic syncdir
To print some information about the account and running operations:
hubic status
To show options:
hubic config
To set the time between synchronizations:
hubic config TimeBetweenSynchronization 30

Sunday, October 05, 2014

Create Ubuntu unity Launcher for the wine program irfanview


IrfanView was installed using the winetricks program, which required downloading an older version (4.33) found on oldapps.com.
Then I created a launcher called  "irfanview.desktop" :
[Desktop Entry]
Type=Application
Name=Irfanview Program Loader
Exec=wine /home/paul/.local/share/wineprefixes/irfanview/drive_c/Program\ Files\ \(x86\)/IrfanView/i_view32.exe
MimeType=application/x-ms-dos-executable;application/x-msi;application/x-ms-shortcut;
Icon=/home/paul/.local/share/applications/irfanview.png
NoDisplay=true
StartupNotify=true
This launcher is a .desktop file based on an askubuntu answer, which starts by suggesting that you copy an existing .desktop file from:
ls /usr/share/applications/*.desktop
I copied the wine launcher.
I changed the image based on this Ubuntu Handbook post.

Right click menu "open with"

It's also possible to edit context menus. I tried editing: "/home/paul/.local/share/applications/wine-extension-jpe.desktop"

I replaced the line starting with Exec by the following:

Exec=env WINEPREFIX="/home/paul/.local/share/wineprefixes/irfanview" wine start /ProgIDOpen IrfanView %f
I adapted the exec command from this message.
Now when I right-click on a JPEG image and choose open with / Irfanview, Irfanview starts with the image open.

Associate as default application

This may require editing this file:
/usr/share/gnome/applications/defaults.list
This page also recommends adding a mime type in the folder:
/usr/share/mimelnk/application/


Monday, September 29, 2014

Make word, pdf and html documents with markdown and pandoc

Markdown is a simple text markup language.
Pandoc is a document converter. Pandoc demo and sample command.

Pandoc commands

Convert a markdown file to PDF :
pandoc -o README.pdf README.md
The pandoc man page says: "If  the input or output format is not specified explicitly, pandoc will attempt to guess it from the extensions of the input and output filenames." That's what happens above. However "The input format can be specified using the -r/--read or -f/--from options, the output format using the -w/--write or -t/--to options."


Makefile 

This psychologist blogs about using a makefile to create beamer presentations.
This researcher provides a makefile for pandoc templates.

With this simple makefile, I can create Microsoft Word, HTML and PDF documents from the same markdown file:
all: docx pdf html

docx: file.md
	pandoc -o file.docx file.md

pdf: file.md
	pandoc -o file.pdf file.md

html: file.md
	pandoc -o file.html file.md

clean:
	rm -f *.html *.pdf *.docx
To create all documents, type:
make
To create only a docx, type:
make docx
To delete all created documents, type:
make clean

Improved makefile with variable

file.pdf : file.md
	pandoc -o file.pdf file.md

%.pdf: %.md
	pandoc -o $@ $<

From a guide to makefiles:
"Here, we have used the percent (%) character to denote that part of the target and dependency that matches whatever the pattern is used for, and the $< is a special variable (imagine it like $(<)) that means "whatever the dependencies are". Another useful variable is $@, which means "the target"."


## Makefile to generate documents based on markdown files
## Inspired by this makefile
## https://github.com/kjhealy/pandoc-templates/blob/master/examples/Makefile
##
## I should use variables for filenames
## Command line to convert:

## How to make this using variables?
## No spaces allowed in file names; there could be a workaround but I didn't try it
## http://www.cmcrossroads.com/article/gnu-make-meets-file-names-spaces-them

## Markdown extension (e.g. md, markdown, mdown).
MEXT = md
## All markdown files in the working directory
SRC = $(wildcard *.$(MEXT))


DOCX=$(SRC:.md=.docx)
PDFS=$(SRC:.md=.pdf)
HTML=$(SRC:.md=.html)


all: $(PDFS)  $(DOCX)
pdf:    clean $(PDFS)
docx:   clean $(DOCX)
#html:   clean $(HTML)


#scrap : scrap.md
#    pandoc -o scrap.pdf scrap.md


# Recipe lines need to start with a hard tab, not 4 spaces!
%.pdf: %.md
	pandoc -o $@ $<

%.docx: %.md
	pandoc -o $@ $<

clean:
	rm -f *.html *.pdf *.docx
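
For file names with spaces, which make does not handle well, a plain shell loop is a possible alternative. This is a sketch rather than a replacement for the makefile: convert_all, run and the DRY_RUN variable are names I made up, and unlike make the loop reconverts every file on each call.

```shell
# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Convert every markdown file in the current directory to the given
# extension (pdf, docx or html). pandoc guesses the output format from
# the extension; quoting keeps file names with spaces intact.
convert_all() {
    ext="$1"
    for src in *.md; do
        # When no .md file exists the glob stays literal: skip it.
        [ -e "$src" ] || continue
        run pandoc -o "${src%.md}.$ext" "$src"
    done
}
# Example:
#   convert_all pdf
#   convert_all docx
```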

Friday, September 26, 2014

Debian GNU/Linux


Configuring the system 

Disk partitioning

Recommended Partitioning Scheme: "For new users, personal Debian boxes, home systems, and other single-user setups, a single / partition (plus swap) is probably the easiest, simplest way to go."  [...] " For multi-user systems or systems with lots of disk space, it's best to put /usr, /var, /tmp, and /home each on their own partitions separate from the / partition."

Keyboard layout

I use a laptop with a Finnish keyboard and a docking station with a French keyboard. I chose the "SHIFT + caps lock" keys to switch from one keyboard layout to the other. By default the French keyboard has a comma as decimal separator on the numeric keypad. I changed this under:
system settings / region and language / layouts / French / Options / Numeric keypad delete key behaviour / 4 level key with dot

Installing programs

Package management with apt-get or aptitude. Aptitude is recommended. Here is a comparison in a forum.

Programming

Install a version control system:
sudo aptitude install git
Add coloration:
git config --global color.ui true

Install the vim text editor
sudo aptitude install vim
Edit the vim configuration file:
vim ~/.vimrc
In this file add:
syntax on
set tabstop=4
set expandtab
set softtabstop=4
set shiftwidth=4
filetype indent on

au BufNewFile,BufRead *.md set filetype=txt

Non-Free (unfortunately)

See Skype on the Debian wiki (for the 64-bit architecture). Alternatives to try are Ekiga and Linphone (Linphone can be used with the French ISP Free). Download Skype for Debian from the website, then:
sudo dpkg --add-architecture i386
sudo apt-get update
sudo dpkg -i skype-install_file_you_downloaded.deb
sudo apt-get -f install
Adobe Flash:
sudo apt-get install flashplugin-nonfree
sudo update-flashplugin-nonfree --install

Super user

As superuser, install the program called "sudo":
apt-get install sudo
How to properly configure the sudoers file on Debian wheezy
I am in the sudo group:
adduser paul sudo
  The user `paul' is already a member of `sudo'.
addgroup sudo
  addgroup: The group `sudo' already exists.
But I get the warning message:
paul is not in the sudoers file.  This incident will be reported.
It seems I have to log out and back in (or restart the system) before the group membership takes effect.
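
A quick way to see whether the running session already knows about a group is to compare what the account database says with what the current login session carries. A minimal sketch; in_group and session_has_group are hypothetical helper names.

```shell
# Succeeds if the account database lists the user as a member of the group.
in_group() {
    user="$1"
    group="$2"
    id -nG "$user" | tr ' ' '\n' | grep -qx "$group"
}

# Succeeds if the current login session carries the group. Without a user
# argument, id reports the groups of the running process, which are only
# refreshed at login.
session_has_group() {
    id -nG | tr ' ' '\n' | grep -qx "$1"
}
# Example:
#   in_group paul sudo && echo "database: paul is in sudo"
#   session_has_group sudo || echo "session: not yet, log in again"
```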

 Web browser

Why Firefox has been rebranded as iceweasel.
"Debian was initially given permission to use the trademarks, and adopted the Firefox name. However, because the artwork in Firefox had a proprietary copyright license which was not compatible with the Debian Free Software Guidelines, the substituted logo had to remain."

Wednesday, July 09, 2014

VIM commands

Help

  • :help  -  vim help
  • :help commandname - help on a particular command
  • CTRL+] - jump to a highlighted topic
  • CTRL+T - jump backwards

Motion

  • :help left-right-motion
  • j, k - move down, up
  • h, l - move left, right
  • b, w - move to the previous or next word
  • ctrl+b, ctrl+f - move one page up or down (ctrl+u, ctrl+d move half a page)

Undo redo

  • u: undo last change (can be repeated to undo preceding commands)
  • Ctrl-R: Redo changes which were undone (undo the undos). 
  • Compare to '.' to repeat a previous change, at the current cursor position. Ctrl-R will redo a previously undone change, wherever the change occurred. 

Switch between navigation and editing mode

  • A - move to the end of the line and switch to editing mode 
  • i - switch to editing mode at the current place (I switches at the start of the line)
  • Escape - switch to navigation mode
  • alt+h alt+j alt+k alt+l - switch to navigation mode and move
  • alt+: - switch to navigation mode and send a command

Search and replace characters

Vim wiki on search and replace 
  • :s/foo/bar/g Find each occurrence of 'foo' (in the current line only), and replace it with 'bar'. 
  • :%s/foo/bar/g Find each occurrence of 'foo' (in all lines), and replace it with 'bar'.
  • :%s/<option value=".*">//g removes the opening tag at the beginning of each line. :%s/<\/option>\n/, /g replaces each closing tag and end of line with a comma + space. This cleans an html list of species for inclusion in a text.
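
The same s/old/new/ syntax works in sed, so a similar cleanup of an html option list can be done outside vim. A sketch, assuming one option element per line; clean_options is a made-up name and the species in the example are only sample data.

```shell
# Strip the <option value="..."> opening tags and the closing </option> tags,
# then join the remaining lines with a comma and a space.
clean_options() {
    sed -e 's/<option value="[^"]*">//g' -e 's/<\/option>$//' |
        paste -sd, - |
        sed 's/,/, /g'
}
# Example:
#   printf '%s\n' '<option value="1">Abies alba</option>' \
#                 '<option value="2">Picea abies</option>' | clean_options
```

Unlike the vim version, this does not leave a trailing comma after the last item.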

Markdown

Display a list of first level header in a markdown document (found in quick markdown navigation/TOC)
:g/^# /#
Then enter the line number to jump to that line.

Line numbers

 Display line numbers
:set nu
Disable line numbers
:set nonu

Editing a whole line

  • dd to delete a whole line
  • yy to copy a whole line
  • p to paste the copied or deleted text after the current line or 
  • P to paste the copied or deleted text before the current line 

Copy, cut and paste

  • Position the cursor where you want to begin cutting.
  • Press v (or upper case V if you want to cut whole lines).
  • Move the cursor to the end of what you want to cut.
  • Press d to cut or y to copy.
  • Move to where you would like to paste.
  • Press P to paste before the cursor, or p to paste after. 

Indentation

Indentation replaced by spaces, add this to the ~/.vimrc file
set tabstop=4
set expandtab
set softtabstop=4
set shiftwidth=4
filetype indent on 
More details on vim indentation in the python wiki.

Multiple files and windows

  • :e filename - edit another file 
  • :ls         - show current buffers
  • :b 2        - open buffer #2 in this window
  • :b filename - open buffer #filename in this window
  • :bd         - close the current buffer (! to forget changes)
  • :bd filename -close a buffer by name 

Windows

  • :sp[lit] filename  - split window and load another file
  • :vs[plit] - same but split vertically  
  • ctrl-w up arrow - move cursor up a window
  • ctrl-w ctrl-w   - move cursor to another window (cycle)
  • ctrl-w_         - maximize current window
  • ctrl-w=         - make all equal size
  • CTRL+z - suspend the process and get back to the shell
  • fg - get back to vim

Vimdiff

View differences between file1 and file2 (vim documentation)
vimdiff file1 file2

spell check

Set spell check only in the local buffer:
:setlocal spell spelllang=en_gb  
 Turn spell check off
:set nospell

Mark word as correct, this creates a spell file under /home/user/.vim/spell:
zg
Mark word as incorrect
zw

Plugins for programming languages

.vimrc

Text colour.
Add syntax highlight to your .vimrc
syntax enable
How to add a file extension to vim syntax highlight
au BufNewFile,BufRead *.dump set filetype=sql
I used it to display markdown files as text files:
au BufNewFile,BufRead *.md set filetype=txt

Monday, May 26, 2014

Debian GNU-Linux installation on a Lenovo T400


I needed an operating system more stable than Microsoft Windows for my daily tasks, such as programming with the R statistical software and writing PDF reports with the Lyx document processor. I read about John MacFarlane (author of pandoc), who is using Debian with the xmonad desktop. This blogger documented why he switched from Ubuntu to Debian. The same blogger also wrote interesting posts at the Electronic Frontier Foundation on privacy issues with Ubuntu: on the one hand Ubuntu offers an easy way to set up full hard drive encryption, on the other hand Ubuntu's default desktop search sends search requests over the internet unencrypted (I should move this additional content to another post).

15 years ago already, I had talked with a friend who was using Debian. I wonder if he still is?

I decided to give Debian a try.

Creating a bootable USB stick

Trying the live version from a USB stick

I introduced the USB key in the laptop,  pressed the "blue thinkvantage" key, entered the BIOS setup and changed the boot order. I placed the USB devices first in the list. The live version seemed to work fine, the Gnome desktop was responsive enough on that machine (Lenovo Thinkpad T400). The external screen was easy to set-up, and network access configured automatically. So I decided to install that Debian system on the hard drive.

Installation

I restarted the laptop with the USB key inserted. I chose the graphical installer. There was a small issue with the fact that the installer was looking for a CD-ROM. But no CD-ROM was available because I was running the install from a USB stick. Based on this blog post, I jumped to the command line interface and typed:
mount /dev/sdb /cdrom 
This command fixed the issue and the installation could continue. I partitioned my hard drive to leave 100 GB for the Windows partition and the rest (144 GB) for the new GNU-Linux system. Partitioning took a frightening amount of time, during which I thought the system was frozen. But after maybe an hour, installation carried on... I restarted the system. I am now writing this blog post from a fresh Debian system.

Hardware support

Docking station

Hot-plugging a USB keyboard worked on the laptop itself, but not on the docking station. A quick fix was to disconnect and reconnect the docking station.
I connected the docking station to a larger external screen and set it up as my main monitor under system settings / displays.

Different Keyboard layouts

My laptop has a Finnish keyboard, and I am located in France, therefore I use a French keyboard. Under system settings / region and language settings I could add the French keyboard layout. The fact that it's possible to "use the same layout for all windows" is a plus compared to windows. I regularly switch from the Finnish to the French keyboard layout. In windows, I had to change the keyboard layout for all windows one by one.

Network drive

From the Nautilus file manager, I could access windows shared drive using my company's internal network login details.
I have been trying to synchronize this folder using Unison (a file synchronizer, comparable to rsync). Setting up a local synchronization requires mounting the Windows share in the file system. mount.cifs from cifs-utils can mount Windows shared drives.
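
As an illustration, mounting such a share by hand could look like the sketch below. It assumes cifs-utils is installed and that the mount point already exists; mount_share, the server, share and mount point names are all placeholders.

```shell
# Mount a Windows share with the CIFS file system.
# uid/gid make the mounted files belong to the current user; mount.cifs
# asks for the password since none is given in the options.
mount_share() {
    server="$1"
    share="$2"
    mountpoint="$3"
    user="$4"
    sudo mount -t cifs "//$server/$share" "$mountpoint" \
        -o "username=$user,uid=$(id -u),gid=$(id -g)"
}
# Example (placeholder names):
#   mount_share fileserver projects /mnt/projects paul
#   sudo umount /mnt/projects    # when done
```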

Network printer

Under system settings /  printers, the network printer appears. But I didn't manage to connect to the Sharp MX 2600 N.

Wireless card

Show all hardware
lspci
 Network controller: Intel Corporation PRO/Wireless 5100 AGN
Install support for wifi device on Debian
apt-get install firmware-iwlwifi
Reinsert this module to access installed firmware:
# modprobe -r iwlwifi ; modprobe iwlwifi
Wifi works! :-)

Location of software packages

A mirror for France is located at http://ftp.fr.debian.org/debian/.

Install research software 

See also other software install tips in this post on Debian GNU Linux
Lyx editor for Latex
sudo aptitude install lyx
Jabref
sudo aptitude install jabref

Evolution with Microsoft Exchange plugin

My company uses Microsoft Exchange on its email servers. An "exchange" plugin can be installed from the Synaptic package manager, but it only works for Exchange 2000 and 2003. This site recommends using the mapi plugin instead, which should work for Exchange 2007 and 2010. As explained here, my company also uses different servers for webmail access and for the internal connection to the mail server.

To find the internal server name, I went to the Windows version of Outlook 2007: right click on mail box / properties / advanced ... / Microsoft Exchange server.

Mail loads fine now. I can send email too.

Calendar events are not displayed, even though outlook calendar events appear within the desktop calendar (The little calendar that appears when clicking on the desktop clock). This page says that I should upgrade to a more recent version of Evolution. This would require the use of Debian backports. Couldn't get backports to work for now. It said that I already have the most recent version of evolution installed.
I add this line to the 
# Backports repository
deb http://ftp.debian.org/debian wheezy-backports main contrib non-free
Ran these commands:
sudo apt-get update
sudo apt-get -t wheezy-backports install evolution
Got this message:
evolution is already the newest version.
Well, that's actually normal, because evolution is currently (August 2014) not in the list of Debian backports.

R and Rstudio install on Debian wheezy 7

The instructions below are kept here for historical purposes; I'll update these R-specific instructions in another blog post.

I used the Synaptic package manager to add the R repository for Debian from a nearby mirror, under : settings / repositories / other software / add.
Add this APT line:
deb http://cran.univ-paris1.fr/bin/linux/debian/ wheezy-cran3/

There was an error:
W: GPG error: http://cran.univ-paris1.fr wheezy-cran3/ Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 06F90DE5381BA480
After looking at several forums, and this stackoverflow question, I installed debian-keyring and added the key with the commands:
gpg --keyserver pgpkeys.mit.edu --recv-key 06F90DE5381BA480
gpg -a --export 06F90DE5381BA480 |sudo apt-key add -
I could then install R version 3 from the synaptic package manager.

I downloaded R-studio and installed it. There was a missing dependency for libjpeg62. I installed that package from Synaptic. Then ran the dpkg command to install rstudio.
dpkg -i rstudio-0.98.507-i386.deb
Then I installed Git in order to clone my R project from an online repository.
git clone  project_repository_url
Within Rstudio, I installed a few packages:
install.packages(c("plyr", "reshape2", "ggplot2"))
install.packages(c("xtable", "markdown", "devtools"))
Ready to work!

Further reading

Wiki on installing Debian on Thinkpad laptops.

Tuesday, May 13, 2014

Ubuntu and windows share

I'd like to use Ubuntu at work, while still being able to access and edit data on a shared Windows network drive (bibliography collection of PDF files, current and archived project documents, presentations at events, datasets, ...). I have approx. 4 GB of data.

Using CIFS (Common Internet File System) from Microsoft, it's possible to permanently map a Windows network drive on an Ubuntu GNU-Linux desktop:
Canonical dropped support for Ubuntu One: "the free storage wars aren’t a sustainable place for us to be, particularly with other services now regularly offering 25GB-50GB free storage."

I could try unison

Wednesday, April 23, 2014

R commands

See also why use R and the RSS feed of posts labelled R.

R code in this post is garbage; this shows the limits of Blogger for displaying R code which contains the assignment operator <-. One solution is to paste html from knitr documents (http://www.r-bloggers.com/three-ways-to-format-r-code-for-blogger/).


R mailing list: Use <- for assignment.

Set operations


x = letters[1:3]
y = letters[3:5]
union(x, y)
## [1] "a" "b" "c" "d" "e"
intersect(x, y)
## [1] "c"
setdiff(x, y)
## [1] "a" "b"
setdiff(y, x)
## [1] "d" "e"
setequal(x, y)
## [1] FALSE

Information about your R system

sessionInfo()
installed.packages()

Handling files

getwd()
list.files(tempdir()) 
dir.create("blabla")
read.csv("data.csv")

Lists

Given a list structure x, unlist simplifies it to produce a vector which contains all the atomic components which occur in x.
l1 <- list(a = "a", b = 2, c = pi+2i)
unlist(l1) # a character vector
x <- 1

S3 methods

x<-1



List all available methods for a class:

methods(class="lm")

 One liners

Remove all objects in the workspace except one :

rm(list=ls()[!ls()=="object_to_keep"])

knitr

These two commands are different.
The first sets chunk options, from within a knitr chunk inside the .Rmd document:

opts_chunk$set(fig.width=10)
The second sets knitr package options, from outside the .Rmd document:

opts_knit$set()

dplyr

pipes
cars %>%
  group_by(speed) %>%
  print %>%
  summarise(numberofcars = n(),
            min = min(dist),
            mean = mean(dist),
            max = max(dist))

group_by() creates a tbl_df object, a wrapper around a data.frame that enables some extra functionality. Note that print() returns its input when called on a tbl_df object, so it can be used inside the pipe without stopping the workflow.


 plyr (I replaced it with dplyr)

progress bar

l_ply(1:100000, identity, .progress = "win")
Rename items in a dataframe with revalue

sawnwood$item <- revalue(sawnwood$item,
    c("Sawnwood (C)" = "Sawnwood Coniferous",
      "Sawnwood (NC)" = "Sawnwood Non Coniferous"))
Rename column names by their names



rename(mtcars, c("disp" = "displacement"))

Plotting with ggplot2


Monday, April 14, 2014

JabRef to manage a bibliography

JabRef is free software that can save a bibliography in BibTeX format. This format can be used to import citations into Windows- or LaTeX-based text editing systems such as Lyx.

Link to PDF or other files

Jabref can automatically create links for files for which you have an entry, if they are located under the main file directory. To change the main file directory: 
Options -> Preferences -> External programs -> Main file directory
But the main file directory can also be specified for each bibtex database under:
File -> Database properties -> General file directory and PDF directory

Friday, April 11, 2014

Creating PDF reports with R on Ubuntu

Texi2pdf

texi2pdf is a function from the tools package that compiles LaTeX files into PDFs.

Using the R command:
> texi2pdf("docs/rapports/draft/template2.tex")
prompted the error:
Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,  :
  Running 'texi2dvi' on 'docs/rapports/draft/template2.tex' failed.
Messages:
sh: 1: /usr/bin/texi2dvi: not found

Installing texlive and texinfo fixed this error.
sudo apt-get install texinfo
sudo apt-get install texlive
For info the source of the texi2dvi bash script was mentioned by this blogger.
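
Before calling texi2pdf from R, it can save time to check from the shell that the required tools are on the PATH. A small sketch: missing_tools is a made-up helper, texi2dvi is the tool named in the error above, and pdflatex is my assumption about what the LaTeX side needs.

```shell
# Print the name of every listed tool that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
# Example: an empty output means both tools are available.
#   missing_tools texi2dvi pdflatex
```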

Accents

There was an issue with accents not rendered.
Loading this package fixes it:
 \usepackage[utf8]{inputenc}

devtools

opts_knit$get() showed me options that don't exist any more in the current version of the knitr package, so I wanted to install the latest version of knitr.
For that I needed the devtools package.
But I couldn't install devtools because of this message:
Cannot find curl-config
As explained in this mail, installing the package "libcurl4-gnutls-dev" fixes this. I could then install the devtools package and load it.

To install the latest version of knitr:
library(devtools)
install_github("yihui/knitr")

xtable

The xtable gallery explains how to do longtable and tables in landscape format. It also demonstrates how to rotate column names and how to print a table of linear model coefficients.

Wednesday, March 26, 2014

Git config on GNU-Linux

Colors

To add coloration:
git config --global color.ui true

SSH

Connect to a remote repository without using the password (ssh key). On Bitbucket, different connection protocols have different repository URL formats. Create a private and public ssh key pair with for example the seahorse key manager. Then add the public key to bitbucket under manage account / security / SSH keys.

Moving from https to ssh url format 

My previous connection was through https. Rename the remote "origin" to "originhttps" (you can skip this command if you start with a fresh repository):
git remote rename origin originhttps
I kept the old remote to be able to connect through https in case the ssh connection doesn't work. This remote can be deleted later with:
git remote remove originhttps

Tell git to use a ssh:// remote

Then tell git to use a remote (the "ssh://" protocol name is optional as shown in this how-to):
git remote add origin ssh://git@bitbucket.org/accountname/reponame.git
Fetch the new remote tracking branch into the local repository. (It will probably not load anything new but is required to set this branch as upstream later):
git fetch origin master

Set upstream branch

The commands "git pull" and "git push" still connected to the https url (which prompts for a password). I had to set the new remote as the default for my master branch:
git branch --set-upstream-to=origin/master
List remote branches (for information)

git branch -r

The repository is now ready to fetch, merge and pull on remote origin without entering the password.
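
The sequence can be rehearsed without touching Bitbucket. This is a minimal sketch, assuming git is installed: a local bare repository (a made-up stand-in for the ssh remote) plays the role of "origin", and the push step is only there because that stand-in starts empty.

```shell
# Scratch setup: a local bare repository stands in for the Bitbucket remote.
demo=$(mktemp -d)
git init --quiet --bare "$demo/remote.git"
git init --quiet "$demo/work"
git -C "$demo/work" -c user.name=demo -c user.email=demo@example.org \
    commit --quiet --allow-empty -m "first commit"
branch=$(git -C "$demo/work" symbolic-ref --short HEAD)

# Same steps as in the post: add the remote, fetch it, set it as upstream.
git -C "$demo/work" remote add origin "$demo/remote.git"
git -C "$demo/work" push --quiet origin "$branch"   # the stand-in remote is empty
git -C "$demo/work" fetch --quiet origin
git -C "$demo/work" branch --set-upstream-to="origin/$branch" > /dev/null

# Ask git which upstream the current branch now tracks.
upstream=$(git -C "$demo/work" rev-parse --abbrev-ref "@{upstream}")
echo "$branch tracks $upstream"
rm -rf "$demo"
```

Once the upstream is set, a plain "git pull" or "git push" uses the new remote.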


See also my post on git commands.

Thursday, March 13, 2014

Regular Expression


Rstudio REGEX
I wanted to replace the # characters at the end of a line so that they don't appear in the code navigator. $ indicates the end of a line in a regular expression.
I replaced #######$ with ####### # .


Friday, January 24, 2014

Add a table of contents to HTML files generated from R Markdown

Update November 2014

With the new versions of knitr and rmarkdown, the custom function is no longer necessary. One can add a YAML header at the beginning of an Rmd file:
---
title: "Development scrap concerning the input data"
output:
  html_document:
    toc: true
---



Old content from January 2014

Knitr creator Yihui explained in a comment on this forum how to add a table of contents to an Rmd file using the knit2html() function:
library(knitr)
knit2html('docs/clean.example.Rmd', options = c('toc', markdown::markdownHTMLOptions(TRUE)))
I followed the RSTUDIO advice on how to customize markdown rendering.
A .Rprofile at the root of my project directory with the following content does the trick:
options(rstudio.markdownToHTML =
  function(inputFile, outputFile) {     
    require(markdown)
    htmlOptions <- markdownHTMLOptions(defaults=TRUE)
    htmlOptions <- c(htmlOptions, "toc")
    markdownToHTML(inputFile, outputFile, options = htmlOptions)
  }
)
It works: I can now use the RStudio button "knit html" or the shortcut CTRL+SHIFT+H and get an HTML file that includes a table of contents!

Thursday, January 23, 2014

GFPM

Problem: In Windows 7, I was getting a UAC message for each .exe component of the GFPM model.
Fix: the article "stop annoying UAC prompts" recommends using the task scheduler and setting the highest privileges for the task. I used it to run a GFPM task, then created a shortcut to:
C:\Windows\System32\schtasks.exe /RUN /TN GFPM\GFPM
I called this Shortcut "Run GFPM without UAC warning". It seems to work, there were no UAC warnings anymore!

A next step would be to run GFPM under wine in linux. That would make it easier to run more than one simulation. From wine it's not going to be possible to use Excel, so I might have to start the batchfiles after the call to world.xls.

Tuesday, January 21, 2014

R commands

A list of commonly used R commands.

Remove all objects from the workspace:
rm(list=ls())

Yihui Xie wrote that "setwd() is bad, dirty, ugly." Use relative paths instead.

 

Testthat library

Run all tests in a directory:
test_dir("tests")

Wednesday, January 15, 2014

Install R Packages

I recently participated in a training on the use of R to extract data from permanent forestry plots. This is how to install the R packages used in that training:
install.packages(c("doBy", "reshape2", "ggplot2", "GISTools", "lattice", "gstat", "knitr", "raster", "xtable", "rgdal"))
Somehow it didn't install all dependencies for ggplot2, I needed to run:
install.packages('ggplot2', dep = TRUE)

Monday, January 13, 2014

Msysgit makes GNU Bash available on Windows

Msysgit makes Bash available on Windows. It is related to MinGW (Minimalist GNU for Windows), itself a fork of Cygwin. With this tool, some Linux-like command line operations can be run on Windows.

 MSYS Git FAQ:
"MSys is an environment for Windows offering a Unix-type shell and a Perl interpreter. Because many parts of Git are still not builtins programmed in C, but instead shell and Perl scripts, Git for Windows needs such an environment. Therefore we ship Git for Windows with a very minimal version of MSys."
Example of commands:
Compute md5sum
md5sum filename

Thursday, January 02, 2014

Ipython notebook

Start server available on local network:
ipython notebook --ip=192.168.xxx.xxx

GNU-Linux bash shell commands


Linux is the kernel of the operating system on top of which other programs are built. A detailed list of the GNU core utilities is available with the command:
info coreutils

 Files 

determine file type and encoding
file filename
list a directory
ls
ls -R #list subdirectories recursively
ls -lh #sizes in human readable format
Find files in subdirectories of the current directory (quotes are required to prevent the shell from expanding the pattern before find sees it).
find . -name "*.pdf"
find . -mtime 0 # modified in the last 24 hours
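
The effect of the quoted pattern can be checked in a scratch directory. A minimal sketch; the directory and file names are made up for illustration:

```shell
# Build a scratch tree with PDF files at two depths (hypothetical names).
scratch=$(mktemp -d)
mkdir -p "$scratch/sub"
touch "$scratch/a.pdf" "$scratch/sub/b.pdf" "$scratch/notes.txt"

# The quoted pattern reaches find unexpanded, so find itself matches it
# against file names in every subdirectory.
pdf_count=$(find "$scratch" -name "*.pdf" | wc -l)
echo "PDF files found: $pdf_count"

rm -r "$scratch"
```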
Find files in the whole system
locate filename

File and folder compression

Decompress a file
 gunzip file.gz
How do I compress a whole directory?
tar -zcvf archive-name.tar.gz directory-name
Where
  • -z: Compress archive using gzip program
  • -c: Create archive
  • -v: Verbose i.e display progress while creating archive
  • -f: Archive File name
To extract content from the archive in the current directory
tar -zxvf archive-name.tar.gz
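
A round trip in a scratch directory shows both commands working together. This is a sketch under assumptions: the file content is made up, -v is dropped to keep the output quiet, and -C (not used above) tells tar which directory to work in.

```shell
# Scratch directory with one file to archive (hypothetical content).
scratch=$(mktemp -d)
mkdir "$scratch/directory-name" "$scratch/restore"
echo "hello" > "$scratch/directory-name/file.txt"

# -C makes tar change directory first, so the archive stores relative paths.
tar -zcf "$scratch/archive-name.tar.gz" -C "$scratch" directory-name

# Extract into a separate directory and read the file back.
tar -zxf "$scratch/archive-name.tar.gz" -C "$scratch/restore"
restored=$(cat "$scratch/restore/directory-name/file.txt")
echo "restored content: $restored"

rm -r "$scratch"
```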

Rename files

For example, to rename all upper-case .JPG extensions to lower-case .jpg:
rename 's/\.JPG$/\.jpg/' *.JPG
Change file permission:
chmod a=rwx filename
chmod 777 filename 
Change file permissions recursively:
chmod -R 755 directoryname
Chmod instructions can be given with characters or numbers, chmod 777 or chmod a=rwx is a question of preference.
  • Some prefer 755 over 777 because giving write access to group and other users could be a security risk. 755 leaves read and execute rights to groups and other users. 755 is visible as "rwxr-xr-x" in ls -l. 
  • The default for document files on Debian seems to be chmod 644, visible as "-rw-r--r--" in ls -l.
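
The correspondence between the numeric and the character notation can be checked on a scratch file. A minimal sketch, assuming GNU stat (its -c option and the %A and %a formats are GNU coreutils features):

```shell
# Temporary file to experiment on.
scratch_file=$(mktemp)

# Numeric form: 7 = rwx for the owner, 5 = r-x for group and others.
chmod 755 "$scratch_file"
mode_755=$(stat -c '%A' "$scratch_file")

# Character form of 644: read/write for the owner, read-only for the rest.
chmod u=rw,go=r "$scratch_file"
mode_644=$(stat -c '%a' "$scratch_file")

echo "755 is shown as $mode_755; u=rw,go=r is octal $mode_644"
rm -f "$scratch_file"
```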

Text files

Count the number of lines in a file
wc -l filename.txt
Count occurrences of a word in a file
grep -roh word filename.txt  | wc -w
Remove duplicated lines from a file
awk '!a[$0]++' input.txt
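
How the awk one-liner works: the array counts how often each line has been seen, and a line is printed only while its count is still zero. A small sketch on made-up input (the array is named seen here for readability):

```shell
# Five input lines, two of them repeated.
deduped=$(printf '%s\n' apple banana apple cherry banana | awk '!seen[$0]++')

# The first occurrence of each line is kept, in the original order.
echo "$deduped"
deduped_count=$(printf '%s\n' "$deduped" | wc -l)
```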
Search with Grep
 grep "text" file.txt
Awk tutorial, for example  filter a large file for lines that have a third field (product code) starting with 44, keep the header line:
awk -F, '$3 ~ /^44/||NR==1' nc201501.dat|less
A regexp matches the beginning and the end of a line with ^ and $.
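
The same filter can be tried on a small comma-separated sample. This is only a sketch: the column names and values below are invented and do not come from the real nc201501.dat file.

```shell
# Invented sample: a header line plus three records, product code in field 3.
scratch=$(mktemp -d)
printf '%s\n' \
  'year,reporter,product,value' \
  '2015,FR,4407,10' \
  '2015,DE,4801,20' \
  '2015,IT,4412,30' > "$scratch/sample.dat"

# Keep the header (NR==1) and the records whose third field starts with 44.
filtered=$(awk -F, '$3 ~ /^44/ || NR==1' "$scratch/sample.dat")
echo "$filtered"
filtered_count=$(printf '%s\n' "$filtered" | wc -l)

rm -r "$scratch"
```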

Follow the end of a log file as it is written 
tail -f filename.log
See tab and end of line characters in a text file
cat -te filename |less

Manipulate strings in files

Replace strings
first="I love Suzy and Mary"
second="Sara"
first=${first/Suzy/$second}
Replace strings with sed

sed -i  's/pattern/replacement/g' bli.txt
sed -i  's/^.*\://g' input_file.txt # edit file in place
grep EMAIL input_file.txt |sed  's/^.*\://g' > output_file.txt
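
The grep and sed pipeline can be tested on a few made-up lines. In this sketch the colon is written without a backslash, since it is an ordinary character in a sed pattern; the names and addresses are invented.

```shell
# Invented input: only the EMAIL lines should survive, without their prefix.
emails=$(printf '%s\n' \
  'NAME:Alice' \
  'EMAIL:alice@example.org' \
  'EMAIL:bob@example.org' | grep EMAIL | sed 's/^.*://')

# ^.*: is greedy, so everything up to the last colon on the line is removed.
echo "$emails"
emails_count=$(printf '%s\n' "$emails" | wc -l)
```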
Replace strings with perl in a git repository

git grep -lz 'readcsvfromgauss'| xargs -0 perl -i'' -pE "s/readcsvfromgauss/readcsvfromgauss0/g"

PDF files

Commands based on the poppler library for PDF manipulation.
Search a text pattern in all PDF files present in a directory:
pdfgrep pattern *.pdf
 Merge multiple PDF into one:
pdfunite in-1.pdf in-2.pdf in-n.pdf out.pdf
Alternatively, pdftk can be used to merge PDF files
pdftk input1.pdf input2.pdf cat output output.pdf

Videos and audio

Install youtube-dl using pip:
 sudo pip install --upgrade youtube_dl
Download a video from youtube :
youtube-dl video_url
Download only the audio in an .mp3 format
youtube-dl --extract-audio --audio-format mp3 video_url

Users

Check your user id
id 
What group do you belong to as a user
groups
Add a user to the super users
adduser username sudo
That user needs to re-log into the shell for the change to take effect.
Add a new user
useradd username
Set a password for the new user
passwd username
Delete a user
userdel username
Show all users
getent passwd
Show all groups
getent group

System

OS release
less /etc/os-release
Disk usage
du -h
Display available space on drives
df -h
Display available RAM memory
less /proc/meminfo

Install a program
sudo apt-get install package_name
 System name
uname -a
file /sbin/init
hostname -f
Start and quit a super user session
su
exit
Last time the system was started
last reboot 
last
Show environment variables
printenv

Job handling

List
jobs
Bring a job to the foreground
fg job_number
Run a job in the background. A command followed by an & will run in the background.
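
A minimal sketch of a background job in a script, using sleep as a stand-in for a long-running command; $! holds the process id of the last background command and wait blocks until it has finished:

```shell
# Start a command in the background; the shell continues immediately.
sleep 1 &
bg_pid=$!
echo "sleep runs in the background with process id $bg_pid"

# Wait for the background job and collect its exit status.
wait "$bg_pid"
bg_status=$?
echo "background job finished with exit status $bg_status"
```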

Stop (suspend) a job
CTRL+Z
Quit a job
CTRL+C
Kill a malfunctioning program:
kill process_id
Find a program id with:
ps aux
Kill a graphical program, by clicking on it:
 xkill

Users

Create a new user
adduser user_name
Temporarily log in as that user
su user_name
Delete a user
userdel user_name

Secure shell

log into a remote machine
ssh user@remote_machine
Copy a local file to a file on the remote machine
scp local_file_name user@remote_machine:path_to_file/file_name
Copy a file from the remote machine to a local file
scp user@remote_machine:path_to_file/file_name  local_file_name
Copy a full directory (dmouraty) from the remote machine:
 scp -rp user@dest:/path destdirectory

Alias

alias ll="ls -lh"

Based on how can i sort du-h output by size
alias du='du -hd1 | sort -h -r'

You can place those commands in your ~/.bashrc to create a permanent alias.
bashrc:
"You may want to put all your additions into a separate file like ~/.bash_aliases, instead of adding them here directly."

.bash_profile and .bashrc

These are places where a user can turn off the system beep:
setterm -blength 0
.bash_profile is executed by login shells, for example when you log in on another tty or when you access a system through ssh. .bashrc is executed by interactive non-login shells, for example when you open a terminal window in Gnome.

Debian Dotfiles
"Now, since bash is being invoked as a login shell (with name "-bash", a special ancient hack), it reads /etc/profile first. Then it looks in your home directory for .bash_profile, and if it finds it, it reads that."
[...] "You may have noted that .bashrc is not being read in this situation. You should therefore always have command source ~/.bashrc at the end of your .bash_profile in order to force it to be read by a login shell.  "
In .bashrc a user can set environment variables and define aliases (see above).
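
The advice quoted from the Debian page could look like this at the end of ~/.bash_profile. It is a sketch: the existence check is an addition that avoids an error on accounts without a .bashrc.

```shell
# End of ~/.bash_profile: make login shells read ~/.bashrc as well.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
```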

Keyboard

bash french blogger recommended a simple shell command to change keyboard layout :
sudo loadkeys fr
See fr-keyboard on the Debian wiki for a more permanent system configuration and for use in GUI apps. Switching between keyboards can then be done with:
setxkbmap de
setxkbmap fr

Information about the system

  • cat /proc/meminfo
  • cat /proc/cpuinfo
  • cat /etc/debian_version
  • lsb_release -a

Shortcuts

Keyboard shortcuts for bash, for example Ctrl+A to go to the beginning of a line.

Documentation