Review of Module 9 of Data Analysis for Social Scientists (MITx, edX) – Practical Issues in Running Regressions, and Omitted Variable Bias
It has been challenging to fully understand the technical concepts taught in this course, and to use R to complete the homework, given my intense workload. So I have settled for understanding the general ideas, and I hope to revisit this (or a similar course) in the future. Anyway, since I have already gained enough credit to pass, there is no need to work too hard 😛 (joking!)
A random selection of what I learnt this week…
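Most statistical packages report the F-test (the joint null hypothesis that all slope coefficients are zero) and a t-test for each coefficient (the null hypothesis that this individual coefficient is zero). The standard error of each coefficient is also reported, which allows a confidence interval to be constructed as the estimate plus or minus a critical value times the standard error.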
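To see where these numbers come from, here is a minimal sketch in plain Python of a regression with a single regressor. The data are simulated, the function name `simple_ols` is my own, and the 1.96 multiplier is a large-sample normal approximation (packages normally use the t distribution instead).

```python
import math
import random


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares and return test statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)

    sigma2 = rss / (n - 2)                # residual variance, n - 2 degrees of freedom
    se_slope = math.sqrt(sigma2 / sxx)    # standard error of the slope
    t_stat = slope / se_slope             # t-test of H0: slope = 0
    f_stat = (tss - rss) / sigma2         # F-test of H0: all slopes = 0 (one slope here)
    return {"intercept": intercept, "slope": slope,
            "se_slope": se_slope, "t_stat": t_stat, "f_stat": f_stat}


# Simulated data (an assumption for illustration): true intercept 1, true slope 2.
random.seed(0)
x = [random.uniform(0, 10) for _ in range(200)]
y = [1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x]

fit = simple_ols(x, y)
z = 1.96  # approximate 95% critical value (normal approximation)
ci = (fit["slope"] - z * fit["se_slope"], fit["slope"] + z * fit["se_slope"])
print(fit)
print("approx. 95% CI for the slope:", ci)
```

With only one regressor the F statistic is simply the square of the t statistic, so the two tests say the same thing; they differ once there are several regressors.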
Prof Esther discussed some practical issues in running regressions, including regressions with categorical variables, and interaction effects. With her examples I got a much better feel for how to interpret linear regressions.
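A small sketch of how I read the coefficients on dummy variables and their interaction. The numbers and the variable names (`treated`, `female`) are made up for illustration and are not from the course. It relies on the fact that a regression on two 0/1 dummies plus their interaction is fully saturated, so the least-squares coefficients can be read straight off the four cell means.

```python
from statistics import mean

# Hypothetical data: (treated, female, outcome). Made-up numbers.
data = [
    (0, 0, 10.0), (0, 0, 12.0),
    (1, 0, 15.0), (1, 0, 17.0),
    (0, 1, 11.0), (0, 1, 13.0),
    (1, 1, 20.0), (1, 1, 22.0),
]


def cell_mean(treated, female):
    """Average outcome for one combination of the two dummies."""
    return mean(y for t, f, y in data if t == treated and f == female)


# Model: outcome = b0 + b_treated*treated + b_female*female
#                  + b_interaction*(treated*female) + error
b0 = cell_mean(0, 0)                    # baseline group (both dummies = 0)
b_treated = cell_mean(1, 0) - b0        # treatment effect when female = 0
b_female = cell_mean(0, 1) - b0         # female gap when treated = 0
b_interaction = (cell_mean(1, 1) - cell_mean(0, 1)) - (cell_mean(1, 0) - b0)

print(b0, b_treated, b_female, b_interaction)
```

The interaction coefficient is the extra treatment effect for the `female = 1` group, that is, a difference of differences. The coefficient on `treated` alone is then the effect only for the baseline group, not an overall average.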
It is possible to use a linear regression framework even when the relationship between the independent and dependent variables is non-linear. For example, a polynomial model adds powers of the independent variable (such as x and x²) as extra regressors; the model stays linear in its coefficients, so ordinary least squares still applies. Regression discontinuity designs were also discussed.
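As a minimal sketch of the polynomial idea, the code below fits a quadratic by ordinary least squares. It builds columns 1, x and x² and solves the normal equations with a small hand-written solver. The helper names `solve` and `fit_polynomial` and the noise-free data are my own assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the row with the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_polynomial(x, y, degree=2):
    """Least-squares fit of y on 1, x, ..., x**degree via the normal equations."""
    k = degree + 1
    design = [[xi ** p for p in range(k)] for xi in x]  # one row per observation
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up, noise-free data from y = 1 + 0.5*x - 2*x**2.
x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [1.0 + 0.5 * xi - 2.0 * xi ** 2 for xi in x]

coefs = fit_polynomial(x, y)  # [intercept, coefficient on x, coefficient on x**2]
print(coefs)
```

The curve is non-linear in x, but the fitting step is the same linear least-squares problem as before, just with an extra column.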
Finally, omitted variable bias occurs when a model leaves out one or more relevant variables that both affect the dependent variable and are correlated with an included regressor. The “bias” is created when the model compensates for the missing factor by over- or underestimating the effect of one of the other factors (Source: Wikipedia).
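A small simulation makes the bias visible. This is a sketch under assumptions of my own: the data-generating process and the names `x1`, `x2` and `slope` are invented for illustration. The true effect of `x1` is 2, but `x2` (true effect 3) is left out of the regression and moves with `x1`.

```python
import random


def slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(1)
n = 5000
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]   # x2 is correlated with x1
y = [1.0 + 2.0 * a + 3.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

short_slope = slope(x1, y)        # regression of y on x1 only; x2 is omitted
delta = slope(x1, x2)             # how strongly the omitted x2 moves with x1
predicted = 2.0 + 3.0 * delta     # true effect + (effect of x2) * delta

print("slope with x2 omitted:", short_slope)
print("true effect plus bias term:", predicted)
```

The slope on `x1` comes out near 3.5 rather than 2, because it also picks up the part of `x2` that moves with `x1`. If `x2` were uncorrelated with `x1` (delta near zero), leaving it out would not bias the slope.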