class: center, middle, inverse, title-slide #
PPOL561 | Accelerated Statistics for Public Policy II
Week 1
Course Introduction
&
Research Design
###
Prof. Eric Dunford ◆ Georgetown University ◆ McCourt School of Public Policy ◆
eric.dunford@georgetown.edu
--- layout: true <div class="slide-footer"><span> PPOL561 | Accelerated Statistics for Public Policy II           Week 1 <!-- Week of the Footer Here -->              Introduction & Research <!-- Title of the lecture here --> </span></div> --- class: outline # Outline for Today <br> <br> - **Course Overview** - **Presentations** - **Research Design** --- class: newsection # Course Overview --- ## Course Goals ![:space 3] 1. **Causality**: understand the challenge of isolating causal effects via random assignment, experiments, instrumental variables and other methods; 2. **Modeling**: understand and implement advanced statistical models, such as limited dependent variable models (binary, ordered, and multinomial outcomes), selection models, and panel models; 3. **`R` Programming**: use `R` to manipulate data and conduct advanced statistical analysis. 4. **Communication/Presentation**: effectively communicate statistical analysis and present publication-quality tables and graphics. --- ## Course Calendar <center> <table class="table" style="font-size: 17px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> Week </th> <th style="text-align:left;"> Topic </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> 1 </td> <td style="text-align:left;"> Course Introduction & Research Design </td> </tr> <tr> <td style="text-align:left;"> 2 </td> <td style="text-align:left;"> Data Wrangling & Presentation in R </td> </tr> <tr> <td style="text-align:left;"> 3 </td> <td style="text-align:left;"> Introduction to Causal Inference </td> </tr> <tr> <td style="text-align:left;"> 4 </td> <td style="text-align:left;"> OLS, Confounders, & Simulation </td> </tr> <tr> <td style="text-align:left;"> 5 </td> <td style="text-align:left;"> Panel Data & Difference-in-Difference </td> </tr> <tr> <td style="text-align:left;"> 6 </td> <td style="text-align:left;"> Instrumental Variables </td> </tr> <tr> <td style="text-align:left;"> 7 </td> <td style="text-align:left;"> Experiments </td> </tr> <tr> <td style="text-align:left;"> 8 </td> <td style="text-align:left;"> Regression Discontinuity </td> </tr> <tr> <td style="text-align:left;"> 9 </td> <td style="text-align:left;"> Midterm </td> </tr> <tr> <td style="text-align:left;"> 10 </td> <td style="text-align:left;"> Matching </td> </tr> <tr> <td style="text-align:left;"> 11 </td> <td style="text-align:left;"> Synthetic Control </td> </tr> <tr> <td style="text-align:left;"> 12 </td> <td style="text-align:left;"> Binary Outcomes </td> </tr> <tr> <td style="text-align:left;"> 13 </td> <td style="text-align:left;"> Ordered & Multinomial Outcomes </td> </tr> <tr> <td style="text-align:left;"> 14 </td> <td style="text-align:left;"> Selection </td> </tr> </tbody> </table> --- ## Course Policies ![:space 4] - The wacky world of **Attendance/Participation** in the Covid Era - **Late Assignments** - **Proof of Diligent Debugging** - **No Practice Exams** <!-- - **Sit with someone different each class** --> - **Instructional Continuity** - **Use of Class Materials** --- ## Work Load + 7 homework assignments (select 4 from 6, all do 7) + One 10-minute "group" presentation + Two 2-hour "written" exams (midterm and final) ![:space 5] .center[ | **![:text_size 7](Assignment)** | **![:text_size 7](Percentage)** | |----------------|----------------| | ![:text_size 6](Attendance) | ![:text_size 6](5%) | | ![:text_size 6](Presentation) | ![:text_size 6](10%) | | ![:text_size 6](Problem sets) | ![:text_size 6](25%) | | ![:text_size 6](Midterm exam) | ![:text_size 6](30%) | | ![:text_size 6](Final exam) | ![:text_size 6](30%) | ] --- ## Succeeding in this course <br><br> - **Come Prepared** - **Ask Questions** - **Collaborate** - **Start Homeworks Early** - **Utilize the Teaching Assistant** --- ## Concerns - **_![:text_color darkred](Different backgrounds)_** + Some have a lot of previous methods exposure; others less so. + We all have different substantive interests - **_![:text_color darkred](Going too fast)_** + Slow me down + Ask questions + I will re-schedule the course calendar if we need time - **_![:text_color darkred](Being called on)_** + It'll happen. Read, follow along with the lecture, and you'll be fine + **_![:text_color darkred](Presenting)_** --- class:newsection # Presentations --- ## General - **_Randomly paired_** with another student; presentation date also randomly assigned. - **_10 minute, in-class presentation with slides_** on a paper related to the material we are discussing. - Presentations will be given at the **_start of class_** each week. + Presenters should set up their slides prior to the start of class. + First presentation is on February 16th (and every week thereafter, except for the week of the midterm & Spring Break) - **![:text_color darkred](Email me a copy of the presentation the _day before_ your scheduled presentation time)** - [**Presentation Rubric**](http://ericdunford.com/ppol561/Assignments/Presentations/presentation_rubric_PPOL561.pdf) --- ## Presentation Schedule <center> <table class="table" style="font-size: 24px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:center;"> Group </th> <th style="text-align:center;"> Week </th> <th style="text-align:center;"> Partner_1 </th> <th style="text-align:center;"> Partner_2 </th> </tr> </thead> <tbody> <tr> <td style="text-align:center;"> 1 </td> <td style="text-align:center;"> 4 </td> <td style="text-align:center;"> Ella Zhang </td> <td style="text-align:center;"> Sahithi Adari </td> </tr> <tr> <td style="text-align:center;"> 2 </td> <td style="text-align:center;"> 5 </td> <td style="text-align:center;"> Fangzi Wang </td> <td style="text-align:center;"> Maryam Khalid Shah </td> </tr> <tr> <td style="text-align:center;"> 3 </td> <td style="text-align:center;"> 6 </td> <td style="text-align:center;"> Tianhui Cao </td> <td style="text-align:center;"> Madeline Kinnaird </td> </tr> <tr> <td style="text-align:center;"> 4 </td> <td style="text-align:center;"> 7 </td> <td style="text-align:center;"> Alexander Adams </td> <td style="text-align:center;"> Harshini Tammareddy </td> </tr> <tr> <td style="text-align:center;"> 5 </td> <td style="text-align:center;"> 8 </td> <td style="text-align:center;"> Merykokeb Belay </td> <td style="text-align:center;"> Lexi Gu </td> </tr> <tr> <td style="text-align:center;"> 6 </td> <td style="text-align:center;"> 10 </td> <td style="text-align:center;"> Charlie Zhang </td> <td style="text-align:center;"> Gloria Li </td> </tr> <tr> <td style="text-align:center;"> 7 </td> <td style="text-align:center;"> 11 </td> <td style="text-align:center;"> Mary Kryslette Bunyi </td> <td style="text-align:center;"> Matthew Ring </td> </tr> <tr> <td style="text-align:center;"> 8 </td> <td style="text-align:center;"> 12 </td> <td style="text-align:center;"> Yousuf Abdelfatah </td> <td style="text-align:center;"> Zixun Hao </td> </tr> <tr> <td style="text-align:center;"> 9 </td> <td style="text-align:center;"> 13 </td> <td style="text-align:center;"> Justine Huynh </td> <td style="text-align:center;"> Vince Egalla </td> </tr> <tr> <td style="text-align:center;"> 10 </td> <td style="text-align:center;"> 14 </td> <td style="text-align:center;"> Ruyi Yang </td> <td style="text-align:center;"> Chau Nguyen </td> </tr> </tbody> </table> --- ## Main things to hit ![:space 5] - What is the issue and why is it **interesting**? - Why is it **hard to answer** the question? - What are the **inferential** challenges? - How did this paper **meet the challenge** (posed by the inferential challenges)? (Many studies don't.) - What did they **find**? - What do these findings **tell us** that we didn't already know? --- ![:space 7] - **Be interesting** + Do **_![:text_color darkred](not)_** read notes + Don't pack too much in + Start strong with a “hook” and with interesting context <br> - **Know your audience** + They have advanced statistical knowledge + But they have not read the paper --- ![:space 7] - **Be Concise** + You can't cover everything so don't try + Do not show huge tables of numbers → in fact, don't copy and paste any tables into your slide. + Generate compelling visuals. + If something is on a slide, explain it clearly. (Otherwise cut it!) - **Be Ready** + Practice out-loud at least twice (preferably to someone who hasn’t read the paper) + 10 minutes – not more, not less + Prepare for questions --- ## Critiques require context An effective critique includes an explanation of how your point would change results <br> - **![:text_color darkred](Bad)**: > There are too many variables! - **![:text_color darkgreen](Better)**: > Multicollinearity of variable `\(X_1\)` and `\(X_2\)` could increase the variance of `\(\beta_1\)` and could explain its statistical insignificance --- ## Critiques require context An effective critique includes an explanation of how your point would change results <br> - **![:text_color darkred](Bad)**: > They didn’t include a variable for `\(Z\)` - **![:text_color darkgreen](Better)**: > `\(Z\)` was omitted and we believe that excluding it matters as `\(Z\)` is likely correlated with variable `\(X\)`, making us wonder if omitted variable bias has distorted the estimate of the effect of `\(X\)` on `\(Y\)`. --- ## Critiques require context An effective critique includes an explanation of how your point would change results - **![:text_color darkred](Bad)**: > The study was only of California. - **![:text_color darkgreen](Better, but still bad)**: > The study was only of California, which is a very diverse state. - **![:text_color darkgreen](Better)**: > I’m concerned about the generalizability of the findings because I expect the coefficient on X to be different in a diverse state such as California due to … --- ## Critiques require context ![:space 7] - Better to have 1 or 2 critique well explained than many poorly explained - It's okay to piggy-back on some of the critiques addressed in subsequent publications; but you need to have some original content. - Be clear and explicit about why you think your proposed adjustment or issue pointed out might change the outcome of the analysis. - Be fair + Avoid "collect more data"-type critiques. --- ## Tips - **_Start preparing early_** - **_Coordinate_** & **_communicate_** with your partner - Hunt for an **_interesting paper_** (makes the exercise far more enjoyable) + Have a paper you already think is really interesting? Look at other papers in that journal. Georgetown subscribes to hundreds of academic journals. + Check out media/non-academic publications for features on academic work, such as the [Monkey Cage](https://www.washingtonpost.com/news/monkey-cage/). These articles will direct you to political science/public policy related work. - Keep in mind the paper must be a **_peer-reviewed_**, empirical article; ideally using one or more of the methods covered in this class --- class: newsection # Research Design --- ![:space 5] .pull-left[ <br> <br> - **Inference**: a belief based on evidence _and_ rules for processing that evidence <br> - **Methodology**: tools for gathering and analyzing data to make valid inferences ] .pull-right[
] --- ###Two Types of Inference ![:space 5] **Descriptive Inference** → What are the facts? + Is the climate changing? + Is the United States politically polarized? + Is global terrorism increasing? + Is Azerbaijan a democracy? -- **Causal Inference** → Why does something occur? - _Why_ is the climate changing? - _Why_ is the United States politically polarized? - _Why_ is (or is not) global terrorism increasing? - _Why_ is (or is not) Azerbaijan a democracy? --- ### A Causal Research Question Typically start with either - (1) an **outcome** (dependent variable) - If what, then `\(Y\)`? - What _causes_ `\(Y\)`? - Associated with a search for causes, e.g. what causes climate change? -- - (2) a **cause** (independent variable) - If `\(X\)`, then what? - What happens to `\(Y\)` if `\(X\)` goes up or down? - Associated with “experimentation”; e.g. What happens if we release greenhouse gases into the air? --- ### Which of these is a causal research question? ![:space 10] - Will Democrats win the next U.S. presidential election? - What factors increased the likelihood of Hillary Clinton's defeat in the 2016 election? - How has electoral performance for the Republican party changed over the last three decades? - What was the result of the last U.S. presidential election? - What role did the economy have on the last U.S. presidential election? --- ### Which of these is a causal research question? ![:space 10] - ![:text_color lightgrey](Will Democrats win the next UK general election?) - What factors increased the likelihood of Hillary Clinton's defeat in the 2016 election? - ![:text_color lightgrey](How has electoral performance for the Republican party changed over the last three decades?) - ![:text_color lightgrey](What was the result of the last U.S. presidential election?) - What role did the economy have on the last U.S. presidential election? --- ### Falsifiability One of the tenets behind the scientific method is that any scientific hypothesis and resultant experimental design must be inherently _falsifiable_. -- **Falsifiability** is the assertion that for any hypothesis to have credence, it must be inherently <u>disprovable</u> before it can become accepted as a scientific hypothesis or theory. -- .center[**![:text_color darkred](Unfalsifiable)**] .center[![:text_color darkred](Can _never_ shown to be false)] .center[**![:text_color darkgreen](Falsifiable)**] .center[![:text_color darkgreen](Can be shown to be false.)] --- ### Falsifiability One of the tenets behind the scientific method is that any scientific hypothesis and resultant experimental design must be inherently _falsifiable_. **Falsifiability** is the assertion that for any hypothesis to have credence, it must be inherently <u>disprovable</u> before it can become accepted as a scientific hypothesis or theory. .center[**![:text_color darkred](Unfalsifiable)**] > ![:text_color darkred](Social welfare spending by the government sometimes impacts poverty in a country.) .center[**![:text_color darkgreen](Falsifiable)**] > ![:text_color darkgreen](_Increased levels_ of social welfare spending by the government _reduces_ the average level of poverty in the country.) --- # Good research questions ### 1. Start from political problem or puzzle - Something important - Not obvious -- ### 2. Builds on an existing research literature/Knowledge - "Stand on the shoulders of giants" - Don't reinvent the wheel -- ### 3. Falsifiable --- ## Generating Questions ### Puzzle-driven - Given what we know, we expect A but observe B. How interesting! -- ### Theory-driven - Deduction -- ### Data/Method-driven - Induction -- ### "Should"-driven - normative --- ## Generating Theories ![:space 10] ![:center_img](Figures/theory_analysis.png) --- ## Generating Theories ### Approach 1: Reason _Inductively_ - Induction works by drawing generalities from specific observations - Sometimes called “bottom-up” theorizing -- <br> ### Approach 2: Reason _Deductively_ - Deduction begins from general, assumed principles/axioms to reach more specific observable realities - e.g. Rational choice theory --- ## Generating Theories What makes for a **_good theory_**: - **Truth**: Maps to reality - **Relevance**: Matters - **Coherence**: Is clear - **Falsifiability**: Can be disproven - **Precision**: Captures the concepts of interest - **Generality**: Theories that can explain more are preferred over theories that can explain less - **Parsimony**: Simple theories are preferred over complex theories --- ## Generating Hypotheses - What would evidence consistent/inconsistent with the theory be? - Think about **_observable_ implications** derived from the theory - Use causal ("if-then") logic - **_Falsifiable_** - Avoid **_Observational Equivalence_** - All hypotheses for two (or more) theories are identical - Different drivers yield the same results - Get around this by establishing clear **_scope conditions_**; reformulating question/theory/concept --- ### (_Social_) Scientific Method ![:space 2] .center[ **Research question(s)** **Clarify the core concepts** **Develop (causal) theory (_with clear scope conditions_)** **Derive specific, testable hypotheses** **Plan data collection** **Gather data/evidence** **Analyze data** **Draw inferences** ]