Before switching to industry, I taught graduate-level computational social science courses at Georgetown University. As an instructor, I tried to balance substance with methodological rigor by training students to use computational methods effectively: to investigate, analyze, and learn from data in order to formulate and test theoretically relevant hypotheses. In my instruction, I matched formal computational training with hands-on empirical examples so that quantitative methods were taught in the context where they are applied.

These courses aimed to train students to: (i) use machine learning methods to explore data and generate hypotheses; (ii) design and implement statistical approaches geared toward inferring causal relationships from observational and experimental data; (iii) synthesize disparate and unstructured data to draw meaningful insights for public policy and political science inquiries; and (iv) visualize data to communicate empirical findings effectively. My goal was to train students to be effective consumers, critics, and producers of computational social science.

Course Catalog

Accelerated Statistics for Public Policy II (PPOL561)

This is the second course in the two-course sequence on quantitative methods for social science for the Master of Science in Data Science for Public Policy (DSPP). The course builds on students’ understanding of multivariate regression and introduces advanced, but commonly used, methods of statistical analysis. The course is broadly divided into two parts: advanced modeling and causal inference. Instruction will concentrate on how to determine the appropriate econometric approach for addressing various types of policy questions, while highlighting the challenges in isolating causal effects. The emphasis is on applied learning; formal proofs and mathematical rigor are presented but are not the principal focus of the course. As part of our effort to teach effective communication skills, students will give presentations on applications of the techniques being studied in class.

Data Science I: Foundations (PPOL564)

This first course in the core data science sequence for the Master of Science in Data Science for Public Policy (DSPP) introduces students to the programming and mathematical concepts that underpin statistical learning. The aim of the course is to provide DSPP students with the foundations necessary to grasp the concepts and algorithms encountered in Data Science II and III. Students will cover topics related to linear algebra (with a focus on linear regression and dimension reduction); multivariate calculus (with an emphasis on optimization algorithms, specifically gradient descent); and probability theory (with an emphasis on simulation and sampling). Throughout the course, students will be introduced to the fundamentals of programming and manipulating data in Python. Students will work in Jupyter notebooks and use Git/GitHub to submit coding assignments, developing literate programming and reproducible research skills they will use throughout the program.

A prior (more advanced) version of this course can be found here.

Introduction to Data Science (PPOL670)

This course teaches Master of Public Policy (MPP) students how to synthesize disparate, possibly unstructured data in order to draw meaningful insights. Topics covered include fundamentals of functional programming in R, literate programming, data wrangling, data visualization, data extraction (via web scraping and APIs), text analysis, and machine learning methods. In addition, students will be exposed to Git and GitHub for reproducible research. The course aims to offer students a practical toolkit for data exploration and to equip them with the skills to incorporate data into their decision-making and analysis.