Machine Learning for Hackers with Debian and Ubuntu


Data Science and Machine Learning are hot topics at the moment. Many people are considering how to extend their skills into these areas and many options have appeared: full online degrees, free online courses combined with free software and, for those who prefer hard copy, a staggering choice of books on the topic.

One of those books is O'Reilly's Machine Learning for Hackers by John Myles White and Drew Conway. The book uses R to demonstrate a series of techniques for analysis and prediction. It is a great opportunity to get an introduction to basic machine learning techniques and to the increasingly popular R platform at the same time.

On page 11 the authors list all the major R packages needed to run their examples (available on GitHub).

I had a look over this list to see how many could be installed on a Debian system using apt-get and found that about half of them were already present. Five of them and one dependency, however, were not yet available, so I've whipped up packages for them and they are now in jessie-backports for all users of the current stable release.

If you are following the exercises in this book, you can get all the software you need with one convenient command:

$ sudo apt-get install -t jessie-backports \
  r-cran-ggplot2 r-cran-lme4 r-cran-rcurl \
  r-cran-reshape r-cran-xml r-cran-arm \
  r-cran-glmnet r-cran-igraph r-cran-lubridate \
  r-cran-rjsonio r-cran-tm
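
If you want to look for other R packages in the archive, it helps to know how the names in the command above are formed. The sketch below assumes the pattern visible in that list (the CRAN name in lowercase with an r-cran- prefix); cran_to_deb is a hypothetical helper I made up for illustration, not part of any Debian tool, and a matching name does not guarantee the package actually exists in the archive.

```shell
#!/bin/sh
# Sketch: translate CRAN package names into the Debian naming pattern
# seen in the apt-get command above (r-cran-<name in lowercase>).
# cran_to_deb is a hypothetical helper, not an existing Debian command.
cran_to_deb() {
  for pkg in "$@"; do
    # Lowercase the CRAN name and add the r-cran- prefix.
    printf 'r-cran-%s\n' "$(printf '%s' "$pkg" | tr '[:upper:]' '[:lower:]')"
  done
}

# CRAN names as R itself spells them (case matters inside R).
cran_to_deb RCurl XML RJSONIO
# prints:
#   r-cran-rcurl
#   r-cran-xml
#   r-cran-rjsonio
```

The resulting names can be checked with apt-cache policy before trying to install anything.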

Thanks to all those who already packaged other parts of R and backported the relevant packages.

Note that the RJSONIO package's authors have not provided a valid free software license, so it is in non-free. It is there to support people using the book, but I would encourage people to use rjson for any new projects as it does have a valid license.