The following table is a compilation of all the main online resources referenced below. For the sake of conciseness, I use a shortened reference for the different resources. We’ll utilize them selectively, but you’ll be all the better for absorbing them completely.

I’ve broken the readings up into “Required” and “Additional Resources and Suggested Materials” for each week of class. For some weeks, there are many short readings (e.g., Week 3), and for other weeks there are fewer readings but they are more involved (e.g., Week 11). The readings for this course are meant to support the concepts and materials covered during lecture. I encourage students to first lightly skim the readings prior to consuming the asynchronous lecture material, and then to read/heavy-skim the readings before the synchronous lecture. This will help you better absorb the main points. It’s always easier to read when you have a sense of where you’re going.

For most weeks, the readings are freely available online. If links break for any of the readings, please let the professor and/or TA know as soon as possible.


Source Site | Full Source Name | Short Source Reference
https://r4ds.had.co.nz | R for Data Science | r4ds
https://git-scm.com/book/en/v2 | Pro Git | progit
http://plain-text.co/ | The Plain Person’s Guide to Plain Text Social Science | ptext
https://swcarpentry.github.io/ | Software Carpentry | carp
https://clauswilke.com/dataviz/index.html | Fundamentals of Data Visualization | datviz
https://bookdown.org/yihui/rmarkdown/ | R Markdown: The Definitive Guide | rmark
https://moderndive.com/index.html | ModernDive | MD
https://www.tidytextmining.com/index.html | TidyText | TT
https://cfss.uchicago.edu/setup/git-cache-credentials/ | Computing for the Social Sciences | css
https://christophm.github.io/interpretable-ml-book/index.html | Interpretable Machine Learning | iml
http://adv-r.had.co.nz/ | Advanced R | advR
https://geocompr.robinlovelace.net/ | Geocomputation with R | geo

RStudio Cheat Sheets: while in the IDE, click Help > Cheatsheets

Week 1: Work Flow and Reproducibility

Week 2: Introduction to Programming in R

Week 3: Reproducibility in Practice

Week 4: Data Wrangling in R

Week 6: Web Scraping

Week 7: Geospatial Data

Week 9: Introduction to Statistical Learning

Week 10: Applications in Supervised Statistical Learning (Regression)

  • Required Readings:
    • Linear Models - James et al. Ch. 3.1 - 3.3
    • Tree-Based Methods - James et al. Ch. 8.1 - 8.2

  • Additional Resources and Suggested Materials
    • Decision Trees in R - DataCamp
    • Montgomery and Olivella (2018) “Tree-Based Models for Political Science Data”. American Journal of Political Science. (See Canvas)

Week 11: Applications in Supervised Statistical Learning (Classification)

  • Required Readings:
    • K Nearest Neighbors - James et al. Ch. 2.2.3 & 3.5
    • Support Vector Machines - James et al. Ch. 9.1 - 9.4
    • (Refresh from Week 10) Tree-Based Methods

Week 12: Interpretable Machine Learning

Week 13: Applications in Unsupervised Statistical Learning

  • Required Readings:
    • Unsupervised Learning - James et al. Ch. 10.1 - 10.3