Review of Module 5 of Data Analysis for Social Scientists (MITx, edX) – Special Distributions, the Sample Mean, the Central Limit Theorem, and Estimation
This week’s lectures and finger exercises seemed much easier to grasp than those of the previous few weeks (phew!). Then I got to the homework and it was disastrous. I think this is one of those modules that require lots of practice to properly understand.
Here’s a brief summary of what I learned this week.
Human subject research
Nazi human experimentation and the Tuskegee Syphilis Study raised the issue of the ethics of conducting research on human subjects. According to the Belmont Report, research is defined as “any investigation conducted with the goal of creating generalisable knowledge”, which means studies conducted for internal use are not considered research. The criteria for appropriate human subject research are based on beneficence, justice and respect for persons.
At this point I would note that many developing country institutions do not have an ethics approval board. I do not know the exact reasons why, and I am sure they are complex, but it affects the quality of research and the ability to publish in well-known journals.
Special distributions
Esther went through the characteristics of the Bernoulli, Binomial, Hypergeometric, Poisson, Uniform, and Exponential distributions.
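To make a few of these concrete, here is a minimal sketch of the probability mass functions for the three discrete count distributions, written from their textbook definitions using only Python's standard library. The function names and all parameter values are illustrative assumptions, not something taken from the lectures. A Bernoulli variable is simply the Binomial case with a single trial.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) successes in n independent Bernoulli(p) trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeometric_pmf(k, population, successes, draws):
    """P(X = k) successes when drawing without replacement."""
    return (math.comb(successes, k)
            * math.comb(population - successes, draws - k)
            / math.comb(population, draws))

def poisson_pmf(k, lam):
    """P(X = k) events when events occur at an average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Illustrative parameters (assumptions, not from the course).
n, p = 10, 0.3

# A valid pmf sums to 1 over its support.
binomial_total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
hyper_total = sum(hypergeometric_pmf(k, 20, 7, 5) for k in range(6))
poisson_total = sum(poisson_pmf(k, 4.0) for k in range(100))  # truncated tail

# The mean of a Binomial(n, p) should equal n * p.
binomial_mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))

print(binomial_total, hyper_total, poisson_total, binomial_mean)
```

Checking that each pmf sums to (approximately) one and that the Binomial mean matches n times p is a quick sanity test before using formulas like these on homework problems.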
Sample mean and Central Limit Theorem
The sample mean is useful because it allows you to estimate characteristics of a phenomenon’s underlying distribution.
Regarding the distribution of the sample mean, the Central Limit Theorem states that as the sample size gets bigger, the distribution of the standardised sample mean approaches a standard normal distribution, provided the underlying distribution has a finite variance. This means that we do not need to know the shape of the distribution we are sampling from in order to know about the behaviour of the sample mean in large samples. (And I agree with Sara that this is pretty cool).
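A small simulation can illustrate the theorem. The sketch below draws repeated samples from an Exponential distribution with rate 1 (clearly not normal), standardises each sample mean, and checks what share falls within 1.96 of zero, which would be about 95% for a standard normal. The sample size, number of repetitions, seed, and choice of distribution are all arbitrary assumptions made for illustration.

```python
import math
import random

def standardised_mean(sample, mu, sigma):
    """Return sqrt(n) * (sample mean - mu) / sigma."""
    n = len(sample)
    return math.sqrt(n) * (sum(sample) / n - mu) / sigma

rng = random.Random(0)   # fixed seed so the run is repeatable
n, repetitions = 50, 5000
mu = sigma = 1.0         # Exponential(rate=1) has mean 1 and standard deviation 1

z_values = [
    standardised_mean([rng.expovariate(1.0) for _ in range(n)], mu, sigma)
    for _ in range(repetitions)
]

# For a standard normal, roughly 95% of the mass lies within +/-1.96.
share_within = sum(abs(z) < 1.96 for z in z_values) / repetitions
print(f"share within +/-1.96: {share_within:.3f}")
```

Rerunning with a smaller n (say 5) should show the approximation getting worse, since the skew of the Exponential distribution has not yet averaged out.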
Estimation
Statistics is the study of estimation and inference, where estimation refers to estimating the parameters that govern an observed stochastic process or phenomenon which we know or assume follows a certain distribution.
An estimator is a function of a random sample, while an estimate is the value that function takes for a particular realised sample.
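The distinction maps neatly onto code: the estimator is the function itself, and the estimate is the number you get back when you feed it data. The sketch below assumes Exponential data and uses one over the sample mean as the estimator of the rate, which is one common choice; the function name, the "true" rate, the seed, and the sample size are hypothetical values picked for illustration.

```python
import random
import statistics

def rate_estimator(sample):
    """Estimator: a rule that maps any sample to a guess for the rate."""
    return 1.0 / statistics.mean(sample)

rng = random.Random(1)   # fixed seed so the run is repeatable
true_rate = 2.0          # hypothetical; unknown in a real study

# One realisation of the random sample...
realised_sample = [rng.expovariate(true_rate) for _ in range(10_000)]

# ...gives one estimate: a plain number, not a function.
estimate = rate_estimator(realised_sample)
print(estimate)
```

A different realised sample would give a different estimate from the same estimator, which is why the estimator itself has a sampling distribution.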