Friday, January 06, 2012

googlecli rediscovered

Yesterday, I rediscovered googlecl, a command-line tool for the Google Data APIs. I had it in my delicious bookmarks but had forgotten about it. I think it might prove useful for editing Google documents with Emacs when at my computer (with google docs edit "mydoc" --editor emacs) and with the web interface when on the road.

And by the way, this post has been uploaded with google blogger post --tags "google, emacs, cli" --title "googlecli rediscovered" post.html.

Thursday, January 05, 2012

Bibtex database management - there is an emacs mode for that

Bibliographic management has always been a sensitive issue. My only requirement is that whatever I use must support BibTeX for an optimal LaTeX experience. I use/have used JabRef, referencer, CiteULike and tested Mendeley (on Linux), but have never been totally convinced. Today, I discovered Ebib, which looks like a very promising candidate for at least some of my requirements. It seems to lack a DOI or PMID [1] import feature, though.

[1] update: here it is, PubMode - A PubMed interface for Emacs :-)

Monday, November 07, 2011

Finally printing on our Epson Aculaser C1100 printer

Following Marc Higgins' advice in this Ubuntu forum thread (January 2nd, 2010 post):

$ wget http://000it.com/files/epson_c1100/epson_c1100_install.tar.gz
$ tar -zxvf epson_c1100_install.tar.gz

$ sudo dpkg -i *.deb

Wednesday, October 26, 2011

Open Source Licenses

Although getting into the gory details of software licensing is probably out of reach (and scope) for those of us who are not lawyers specialised in that topic, it is always good to have some basic understanding of licenses when writing and distributing programs. Here are two interesting reads by Ed Burnette posted in 2006 on ZDNet: How to pick an open source license part 1 and part 2. The Open Source Initiative is also a good source of information.

These sources are most helpful when you use some third party free/open code and thus need to make sure that your software's license is compatible with its dependencies'.



Tuesday, October 25, 2011

Using an R session from a different emacs process

Here is my use case. Imagine you are running one (or several) R sessions within Emacs - many buffers open, with the R session(s), scripts and other files. You want to open another R code file, somewhat related to your current work in progress, but without adding yet another buffer to that instance. You want to open that other file in a second Emacs process and then connect to the R session in the first Emacs instance. Does that make sense?

In brief, (1) start an Emacs server in your first process and (2) use emacsclient to connect to the server's R process. (1) is done by typing M-x server-start, and (2) by opening a new Emacs client process with emacsclient -nw and connecting to the R session with M-x ess-request-process (possibly specifying the session if several are available).

This can also be done from different machines, although I have not investigated that further.

Friday, March 04, 2011

Postdocs' careers, postdocs' laments... my 2 cents

An academic career is quite an odd path. It has some great and unique aspects that come with a great deal of risk, uncertainty and compromise (see for instance 'Is doing a PhD a waste of time', 'Goodbye academia, I get a life', 'How not to succeed in Academia' for some recent examples). Jennifer Rohn suggests giving postdocs a real career path. Derek Lowe does not agree.
I think part of the problem could be tackled earlier:


  1. Value non-university studies. There are a lot of great opportunities out there for people who do not fancy university studies. Not going to university is not a failure, it's a choice. Studying and passing exams is not what suits everybody, just as being manually or artistically gifted is not equally shared among all of us.
  2. For those who do make the choice and graduate, show that there are alternatives to staying on at your university to do a PhD. These alternatives are at least as great, as interesting and as challenging. Doing a PhD is not what the 'best' grads do; it's one of several opportunities, it's a choice. Academic authorities have a responsibility in this, as highlighted by Derek Lowe.
I think that this is a possible way to self-regulate the number of PhD students and Postdocs. 

Monday, February 07, 2011

R -- feeling at home once again

Today, I once again stumbled on one of these unexpected (well, at least before understanding what's really going on) behaviours that make you feel at home using R. You know, these little things that make you dig deeper to understand what's going on. I thought I would write about this one, as a little souvenir.
I was working on MSnbase, when one of the commands that show()ed the content of a reasonably large object took an unreasonably long time to print. Basically, I needed to print the range of a list of numerics. The issue can be illustrated like this:
> set.seed(123)
> l1 <- replicate(20,rnorm(10000),simplify=FALSE)
> length(l1)
[1] 20
> sapply(l1,length)
[1] 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 10000 
[13] 10000 10000 10000 10000 10000 10000 10000 10000 

My code was supposed to calculate the range of the list. As it is above, it is straightforward to efficiently get the expected result with
> range(l1)
[1] -4.382098  4.322815
> system.time(range(l1))
   user  system elapsed
  0.004   0.000   0.006

However, in my case, the list was named:
> l2 <- l1
> names(l2) <- paste("X",1:length(l1),sep="")
> all.equal(l1,l2)
[1] "names for current but not for target"
> range(l1)==range(l2)
[1] TRUE TRUE
## but...
> system.time(range(l2))
   user  system elapsed
   0.12    0.00    0.12

Even though it now seems obvious, it was not the first time I had come across this. I realised that calling range on a named list was taking a lot of time, and that range(sapply(l2,range)) was much faster. Rprof indicated that it was the c primitive that accounted for the big difference in time. Eventually, it was when looking at the range.default code that things became clear:
> range.default
function (..., na.rm = FALSE, finite = FALSE)
{
    x <- c(..., recursive = TRUE)
    if (is.numeric(x)) {
        if (finite)
            x <- x[is.finite(x)]
        else if (na.rm)
            x <- x[!is.na(x)]
        return(c(min(x), max(x)))
    }
    c(min(x, na.rm = na.rm), max(x, na.rm = na.rm))
}

The line that recursively concatenates the ... arguments transforms the list into a numeric vector and names all its individual elements, thus generating a named numeric() of length 20 times 10000, which takes far too much time to be applicable to larger lists.
> nm <- names(c(l2,recursive=TRUE))
> head(nm)
[1] "X11" "X12" "X13" "X14" "X15" "X16"
> tail(nm)
[1] "X209995" "X209996" "X209997" "X209998" "X209999" "X2010000"
## and...
> system.time(c(l2,recursive=TRUE))
   user  system elapsed
  0.108   0.000   0.109
> system.time(c(l1,recursive=TRUE))
   user  system elapsed
  0.004   0.000   0.001

My empirical solution of running range on each element of the list through sapply ran much faster because only 10000 elements had to be named 20 + 1 times. Now I know why, and that I had better un-name the list before calling range on it.
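To wrap up, here is a minimal sketch of that fix (re-creating the l2 list from above): stripping the names before calling range gives the same result while avoiding the costly name propagation in c(..., recursive = TRUE).

```r
## Re-create the named list used above
set.seed(123)
l2 <- replicate(20, rnorm(10000), simplify = FALSE)
names(l2) <- paste("X", seq_along(l2), sep = "")

## Slow path: c(..., recursive = TRUE) builds 20 * 10000 names
r.named <- range(l2)

## Fast path: strip the (top-level) names first
r.unnamed <- range(unname(l2))

stopifnot(identical(r.named, r.unnamed))
```

Note that unname only removes the top-level names, which is all that is needed here, since the individual vectors in the list are themselves unnamed.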