I've been learning a bit of statistical computing with R on the side lately from Chris Paciorek's Berkeley course. I just got introduced to knitr and it's damned sweet! It's an R package that takes a LaTeX file with embedded R and produces a pure LaTeX file (similar to how Rails renders an .html.erb file into a .html file), where the resulting LaTeX file contains the output of the R code. It makes it super easy to embed statistical calculations, graphs, and all the good stuff R gives you right into your TeX files. It lets you put math in your math, so you can math while you math.
I've got a little project which:
- Runs a Python script that uses Selenium to scrape a web page for 2012 NFL passing statistics.
- "Knits" a TeX file with embedded R that cleans the raw scraped data, produces a histogram of touchdown passes by team, and displays the teams with the fewest and the most touchdowns.
- Compiles the resulting TeX file and opens the resulting PDF.
- Cleans up any temporary work files.
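A rough sketch of how a driver script for those four steps could look in Python. This is not the project's actual script: the names scrape.py and tds2012.Rnw are placeholders assumed for illustration (only tds2012-out.tex appears in this post), and the PDF opener is a guess that covers macOS and Linux only.

```python
import shlex
import subprocess
import sys
from pathlib import Path

# File names are assumptions for illustration, except tds2012-out.tex.
SCRAPER = "scrape.py"         # hypothetical Selenium scraper script
SOURCE = "tds2012.Rnw"        # hypothetical pre-knit file (LaTeX + embedded R)
KNITTED = "tds2012-out.tex"   # knitted, pure-LaTeX output


def build_steps(scraper=SCRAPER, source=SOURCE, knitted=KNITTED):
    """Return the commands for scrape -> knit -> compile -> open, in order."""
    knit_expr = f"knitr::knit('{source}', output = '{knitted}')"
    pdf = str(Path(knitted).with_suffix(".pdf"))
    opener = "open" if sys.platform == "darwin" else "xdg-open"
    return [
        [sys.executable, scraper],                         # scrape the stats
        ["Rscript", "-e", knit_expr],                      # knit to pure LaTeX
        ["pdflatex", "-interaction=nonstopmode", knitted],  # compile to PDF
        [opener, pdf],                                     # show the result
    ]


def temp_files(knitted=KNITTED):
    """Work files to delete: the knitted .tex plus LaTeX's .aux and .log."""
    stem = Path(knitted).with_suffix("")
    return [Path(knitted), stem.with_suffix(".aux"), stem.with_suffix(".log")]


def run_pipeline():
    """Run every step, stopping at the first failure, then clean up."""
    for cmd in build_steps():
        print("$", shlex.join(cmd))
        subprocess.run(cmd, check=True)
    for path in temp_files():
        path.unlink(missing_ok=True)


# Call run_pipeline() to execute the steps for real.
```

Skipping the loop over temp_files() leaves the knitted .tex file on disk, which is handy when you want to inspect what knitr generated.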
Here's what the pre-"knitted" LaTeX looks like with the embedded R:
[Listing not preserved: the 25-line pre-knit LaTeX source with embedded R]
You can comment out the line in the factory script that deletes the tds2012-out.tex file if you want to see what it looks like post-knit. The resulting TeX file basically contains a ton of new command definitions, but the meat of it is what it does with your R code. It formats and displays the R code itself, and then it displays the output of the R code. Wherever the output is a graph, you'll see \includegraphics[...]{...}. knitr will do the R computation, render the graphics, create a figures subdirectory, and store them there for the \includegraphics to reference. Whenever the output is simply text or mathematical expressions, you'll see the R output translated to pure LaTeX markup.
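To make the "embedded R" part concrete, here is a toy Python sketch that pulls the R chunks out of a pre-knit document. It assumes knitr's noweb-style chunk syntax (a `<<label, options>>=` header line, the R code, then a line containing only `@`). The sample document is made up for illustration and is not the listing from this project, and the code only finds chunks; it does not run any R or mimic how knitr works internally.

```python
import re

# A tiny, made-up pre-knit document (illustrative only).
SAMPLE = r"""\documentclass{article}
\begin{document}
<<load-data, echo=FALSE>>=
tds <- c(26, 31, 43)
@
The mean is \Sexpr{mean(tds)}.
<<td-hist, fig.height=4>>=
hist(tds)
@
\end{document}
"""

# A chunk starts with <<header>>= on its own line and ends at a line
# containing only "@".
CHUNK_RE = re.compile(
    r"^<<(?P<header>[^>]*)>>=[ \t]*\n(?P<body>.*?)^@[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_chunks(text):
    """Return (label, options, r_code) for each noweb-style chunk in text."""
    chunks = []
    for match in CHUNK_RE.finditer(text):
        # The header is "label, option=value, ..."; split off the label.
        label, _, options = match.group("header").partition(",")
        code = match.group("body").rstrip("\n")
        chunks.append((label.strip(), options.strip(), code))
    return chunks


for label, options, code in extract_chunks(SAMPLE):
    print(f"[{label}] options={options!r}")
    print(code)
```

In the knitted file, each of these chunks is what gets replaced by the formatted R code plus its output, and a plotting chunk like the second one is where an \includegraphics would show up.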
Pretty cool stuff!