On learning to “code”
Debates about the importance of programming (which, in this context, is often symptomatically called “coding,” in order to emphasize its mysterious character) have been a regular feature of digital-humanities discourse for quite a while. Programming has been alleged to be the deciding criterion for membership in the DH communion, the essential skill for any future humanist, the cruel stratagem by which the most privileged shore up their power, the avant-garde of technological solutionism’s invading force, and so on. The worst effect of such discourse is to turn the acquisition of a skill into a test of essence.
In this course, programming is just that, a skill. It is most nearly akin to your skills in written composition: in fact, programming is a practice of disciplined expression whose main purpose is to explain how to figure something out. That is not so different from essay-writing, and I hope you will find that the kinds of reasoning you use when you program both draw on and contribute to the reasoning you use in your scholarly argumentation.
We will spend more than half our time in this course developing the skill of programming. Learning to program is a good way to get a grasp of the digital in scholarship and in contemporary life. It opens a window into how computers work or fail to work. And it multiplies your modes of creating and interacting in the digital realm.
Programming languages are legion, and debates about their relative merits exemplify all the worst traits of status competitions in technoculture. Yet it is a fundamental result of the theory of computation (closely related to the Church-Turing thesis) that any programming language with certain basic features can perform any computation that any other such language can. Different languages make some things easier and some things harder, but they are all, on a deep level, equivalent.
I have chosen R for this course because it is a language designed for analyzing data. R is a good idiom for thinking about data, especially data that can be cast into the form of a table or multiple tables. R is also complemented by a nicely comprehensive development ecosystem—that is to say, the excellent RStudio program for interacting with R, and the huge, ever-growing repository of packaged R software for data analysis and visualization, CRAN.
R is not the simplest idiom for an introduction to algorithms and data structures in general, nor is it well-suited to human-interface programming, nor is it adapted for high-performance, large-scale computation, nor is it the best window onto the lowest levels of systems programming. R is also notorious among programmers for some of its eccentric design choices; if you do know another programming language, your expectations may be frustrated by R’s oddities. In order to practice some programming fundamentals, we will occasionally use R against the grain. But that too is a useful lesson in the flexibility of programming languages.
If you become interested in any of those other aspects of computation, you will likely learn other programming languages to pursue them. But almost all programmers testify that learning further programming languages is a much more straightforward process than learning the first one. At the end of this course, you will not be limited to R; you will be ready to learn whatever you need.
Programming in this course
I predict with absolute certainty that you will be frustrated from time to time. Learning a complex skill like programming requires sustained, deliberate practice, and it is not possible without making mistakes. Though your abilities as a humanist have prepared you well for acquiring new complex skills, our microculture—the culture of the American Ph.D. program—encourages you to internalize all problems, explaining them to yourself as a personal failure. Furthermore, you are expected to overcome obstacles on your own, as part of the long succession of tests of your personal worth which is supposed to culminate in your employment as a professor. In defiance of this ideological way of thinking, we will cultivate a communal ethos of programming and problem-solving. Our goal is shared flourishing, not a competition with winners and losers.
What does this mean in practice? First of all, seek help and support without hesitation from me and from your classmates. Collaboration is expected and required in this course. If you find yourself saying, “This doesn’t work because I’m just not good at this,” stop and deliberately reframe: “This doesn’t work because there is something amiss in the code I have written, but I have the ability to figure out what is going wrong and fix it. I can ask for help when I get stuck. If it takes time for me to figure things out, that is because I am learning something new.”[1]
Collaboration is, however, a two-sided exchange. When it becomes one-sided, it loses much of its value. It is unethical, and detrimental to learning, to rely entirely on someone else to resolve all your problems. I have no wish to police you, but I am always glad to discuss any concerns about the way your collaboration is going, with you individually or in groups. In the meantime, as one guideline, I expect you to rigorously document and attribute any part of your individual work, including code, that you owe to someone else.
In addition to each other, you can also look online for help. In the last few years, Stack Overflow has become the programmer’s best friend: this site amasses normally quite reliable answers to specific questions. You can use its language-specific search to restrict to questions and answers about R, and you can, of course, pose questions of your own, though sometimes the “community” can be uncharitable.
There is a small community of helpful occasional participants on the DH community’s own version of Stack Overflow, DH Answers.
We’ll talk more about this together, but for the moment, here are some initial suggestions about what to do when you try to “Knit” your R markdown and you get errors or unexpected results. The key is to isolate the problem in the shortest possible segment of your code.
Use copy and paste. Create a new R markdown file and paste in just a short segment of your original R markdown source. See if you can Knit that to PDF successfully. If it knits, paste in more of your original. Keep going until you hit the error. Then see if you can figure out what causes the error in the part you just added. (There are fancier ways to do this using RStudio’s debugging tools, but when you’re starting out the copy-paste-and-test method is by far the best.)
Take markdown out of the equation. Run your R code only, step by step, in the console. RStudio makes this a little more convenient with its commands to “Run Lines” (in the Code menu, or use Command-Return/Control-Return [Mac/PC]) or “Run Chunk.” This effectively copies and pastes the line or chunk to the console and runs it.
Check each step of the code to see that it does what you expect. After each line, print the values of the variables you are working with to see if they are what you think.
If you are trying to process a large amount of data and getting odd results, feed in toy data. Replace your long Moby-Dick text with one sentence, or three lines, and see if things work as you expect.
Don’t forget the dangers of side effects! When you knit code, R executes your program from a clean slate, with no variables defined except the ones actually defined in your script. Sometimes your code runs in the console only because of some variable you defined earlier and then forgot to include when you wrote out the script. To get the clean slate for the console, use the “Clear Workspace…” command on the “Session” menu.
If you get weird error messages from R (sometimes these appear in the knitted output PDF in place of the results you want), and you isolate the cause but still can’t fix the error, there’s never any reason not to Google it! Stack Overflow is your likeliest source of both Google hits and answers. Also don’t hesitate to get in touch with the rest of the class or to e-mail me.
If all your code runs fine when you run it separately in the console as described above, and yet you are still getting errors when you knit, you may be dealing with trickier R markdown bugs. These can be harder to diagnose, and I’m happy to help. You can sometimes try to “Knit HTML” instead of PDF with more success. If you have to turn in a homework with separate code (in a .R file) and commentary (in a .md file) rather than R markdown, that’s okay. But if possible it’s best to stick with the literate programming paradigm of R markdown.
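To make the advice about toy data and checking each step concrete, here is a minimal sketch in R. It is only an illustration, not a model solution to any assignment: the sentence and the variable names (toy_text, lowered, words, word_counts) are made up for this example, and the word-splitting rule is deliberately crude.

```r
# A toy stand-in for a long text: two short sentences instead of all of Moby-Dick.
# (All variable names here are hypothetical, chosen just for this example.)
toy_text <- "Call me Ishmael. Call me Ishmael again."

# Step 1: convert to lower case, then print the result to check it.
lowered <- tolower(toy_text)
print(lowered)

# Step 2: split into words wherever there is a run of non-letters.
# strsplit() returns a list, so [[1]] pulls out the character vector.
words <- strsplit(lowered, "[^a-z]+")[[1]]
print(words)

# Step 3: drop any empty strings the split may have left behind, and look again.
words <- words[words != ""]
print(words)

# Step 4: count how often each word occurs, and check the counts by eye.
word_counts <- table(words)
print(word_counts)
```

With an input this small you can work out by hand what each step should produce and compare that with what R prints. Once every step behaves as you expect on the toy sentence, swap the long text back in.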
[1] It may horrify you to use this corny language of self-improvement, but graduate school is full of enough assaults on your self-esteem already.