To do the work in this course, you will need to install some specialized software.
The programming language we will be learning is R. (R is both a language and an environment for analyzing data. We’ll use both aspects.)
R can be downloaded at http://cran.rstudio.com. Choose the installer for your operating system:
- Mac: http://cran.rstudio.com/bin/macosx/ (Choose the “Mavericks” version if you have OS X 10.9 or higher)
- Windows: http://cran.rstudio.com/bin/windows/base/
Incorporating code and data analysis into writing is tricky. The most convenient way to do that for our purposes will require a typesetting system called TeX. TeX is a very powerful general-purpose typesetter (and a programming language in its own right). We won’t be using it directly, but we will need the software.
Mac users: visit http://tug.org/mactex/ and download the installer.
Windows users: you have two options.
- Use the MiKTeX installer at http://miktex.org/download
- or download the .exe file linked on http://tug.org/texlive/acquire-netinstall.html and double-click it. That should do it, though there are more directions on that page.
(I originally suggested TeX Live before MiKTeX, but it sounds like the latter is easier to set up.)
You’ll be doing all your programming work for the course within RStudio. This very convenient program combines a text editor for writing R programs with an R console for entering R commands or running entire programs and seeing their output. It also makes writing about code and data easier by automating the rather complicated business of running a program, incorporating its output into a written document, and typesetting the whole thing together.
Download RStudio for your operating system at: http://www.rstudio.com/products/rstudio/download/
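Once R, TeX, and RStudio are all installed, a quick check from RStudio's console can confirm that R runs and that it can find a TeX program. This is only a rough sketch, not an official procedure: it assumes your TeX installation provides an executable named `pdflatex` on the system search path, and the helper name `check_setup` is made up for illustration.

```r
# Hypothetical helper (not part of the course materials):
# report the R version and whether a pdflatex executable can be found.
check_setup <- function() {
  # Sys.which() typically returns "" when the program is not on the PATH
  tex_path <- unname(Sys.which("pdflatex"))
  list(
    r_version = R.version.string,
    tex_found = nzchar(tex_path),
    tex_path  = tex_path
  )
}

status <- check_setup()
print(status)
```

If `tex_found` comes back `FALSE`, that does not necessarily mean the TeX installation failed; restarting RStudio after installing TeX is often enough for the search path to be picked up.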
We’ll be making use of a rather large number of R “packages” (software that extends what R can do). These are easily installed using the “Install Packages…” command in RStudio’s Tools menu.
We’ll need the following packages right away:
devtools. Then type the following into the R console:
which will download some additional bits and bobs for the course. (I’ll ask you to run that command again when I update those bits and bobs.)
For a sneak preview of what it’s all about, visit http://rmarkdown.rstudio.com/.
If you’d like to get ahead of the game, here are most of the rest of the packages on our menu. Copy and paste into RStudio’s console:
install.packages(c("ggplot2", "stringr", "lubridate", "reshape2", "XML", "jsonlite", "httr", "dplyr", "tidyr"))
I can’t promise you won’t get errors. Mac users may have to go to the App Store and install the OS X Developer Tools. More help to follow; you won’t need any of these immediately.
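If some installations do fail, it can help to see exactly which packages from the list above are still missing before trying again. The following is a minimal sketch under that assumption; the helper name `missing_packages` is invented here for illustration, and it only inspects your library without installing anything.

```r
# The packages named in the install.packages() command above.
wanted <- c("ggplot2", "stringr", "lubridate", "reshape2", "XML",
            "jsonlite", "httr", "dplyr", "tidyr")

# Hypothetical helper: return the names in pkgs that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

still_needed <- missing_packages(wanted)
print(still_needed)

# To retry only the packages that are still missing, uncomment:
# install.packages(still_needed)
```

An empty result (`character(0)`) means everything on the list is already in place.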
A virtual alternative/supplement
Though I think “native” software is rather easier to set up and use, it can be very convenient to have a setup that you are absolutely guaranteed is identical to the one your peers and I are talking about. The way to do this is to emulate such a setup, creating a virtual machine (VM) whose configuration includes R, TeX, and RStudio in a completely controlled environment. VMs have gotten a lot less arcane, to the extent that I have been able to put something together for you using two pieces of free software, Vagrant and VirtualBox.
To install the virtual environment, go to http://github.com/agoldst/litdata-vagrant and follow the instructions there.