Sunday, April 9, 2017

F is for Ronald Fisher

I fell a bit behind in my schedule. Friday was spent volunteering for my choir and attending a benefit, and Saturday was spent recovering from the night before and attending a movie night with friends. But I'll have some great videos to share of the event soon!

One reason I didn't write my post on Friday was that I covered the topic I was originally going to write about (factorials) last year. So I needed to make time to sit down, think of a new topic, and then write the post, and that sadly didn't happen. But I remembered many of the contributions of Ronald Fisher, and decided it might be nice to feature a statistician in one of my posts.

By Flickr Commons, CC BY 2.0

Fisher was an English statistician and biologist. Many of his contributions are to evolutionary theory, genetics, and related topics, but he also gave us many statistics and methods. Probably the best known contribution is the Analysis of Variance (ANOVA). I'll get into the mechanics of statistical analysis later, which will help you understand what's going on when we run an ANOVA. But this extremely useful test lets you compare means from more than 2 groups.

Using my caffeine study example, I could have 3 groups instead of 2: control (no caffeine), experimental 1 (receives 95 mg of caffeine, the equivalent of one cup of coffee), and experimental 2 (receives 190 mg, 2 cups of coffee). I could then compare the test scores across the 3 groups. This would let me explore many possible outcomes, and in fact, I could add a few more experimental groups, where I give different amounts of caffeine. It could be that more caffeine is better. Or it could improve performance up to a point, and then flatten out. We could even find a point where more caffeine is actually harmful. ANOVA is great when you have multiple (but not a huge number of) study groups, each of which receives a specific intervention or treatment. If you have more than a handful of groups, other analyses will serve you better.

Fisher named this analysis after what it does (it analyzes variance, separating the variance that is due to the treatment from the variance that is error), but the test still carries his name. The statistic ANOVA produces is F, for Fisher (the letter was chosen in his honor by the statistician George Snedecor). It is compared to a critical value from the F distribution to determine if the test is significant (there is a real difference, larger than what we would expect by chance alone). You might even hear people refer to the F test - in this context, they're talking about ANOVA.
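To make the idea of splitting up variance a bit more concrete, here is a minimal sketch of a one-way ANOVA for the 3-group caffeine design, written in plain Python. The test scores are invented purely for illustration (I haven't run this study), and the function name one_way_anova is my own, not something from a statistics package.

```python
# Hypothetical test scores, made up for illustration only.
groups = {
    "control (0 mg)": [70, 72, 68, 74, 71],
    "1 cup (95 mg)": [75, 78, 74, 77, 76],
    "2 cups (190 mg)": [80, 79, 83, 81, 82],
}


def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for s in samples for x in s]
    grand_mean = sum(all_scores) / len(all_scores)

    # Variance due to the treatment: how far each group mean sits
    # from the grand mean, weighted by group size.
    ss_between = sum(
        len(s) * (sum(s) / len(s) - grand_mean) ** 2 for s in samples
    )
    # Error variance: how far each score sits from its own group mean.
    ss_within = sum(
        (x - sum(s) / len(s)) ** 2 for s in samples for x in s
    )

    df_between = len(samples) - 1
    df_within = len(all_scores) - len(samples)

    # F is the ratio of the two mean squares.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova(list(groups.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")  # F(2, 12) = 37.50
```

With these made-up scores, F is far larger than the critical value for 2 and 12 degrees of freedom at the .05 level (about 3.89), so the test would be significant. In practice you would let your statistics software do this and give you an exact p-value.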

He also gave us the Fisher exact test, which I use less often than ANOVA but have still used. It lets you test whether two proportions are equal to each other. At my old job, we used Fisher exact tests quite a bit to compare our sample characteristics to population characteristics. That is, we would do surveys of VA patients, and then make sure our sample represented that population well in terms of gender, race, ethnicity, and so on. The assumption is that, if the sample is similar to the population in characteristics, then it should provide similar answers to the survey questions (the topic of study) as the population would have.
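For the curious, here is a rough sketch of how a two-sided Fisher exact test works on a 2x2 table of counts, again in plain Python. The counts are hypothetical (two tiny groups split by gender), the function name fisher_exact_two_sided is mine, and I'm assuming the common convention of adding up the probabilities of every table that is no more likely than the one observed. Real software may handle ties and rounding a little differently.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of seeing k in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    smallest = max(0, col1 - row2)
    largest = min(row1, col1)

    # Sum the probabilities of all tables at least as unlikely as ours.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(smallest, largest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: group A has 3 women and 1 man,
# group B has 1 woman and 3 men.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))  # 0.4857
```

Here the p-value is well above .05, so with groups this small we couldn't conclude that the two proportions differ.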

Fisher also gave us maximum likelihood estimation (MLE), which is getting into more advanced statistical topics. I'll be honest - I understand it, but not well enough to actually teach it to someone. But I knew I would be remiss if I didn't at least mention MLE.

Stay tuned for the letter G later today!
