Saturday, April 29, 2017

Y is for Y (Dependent Variables)

Just as x signifies the independent or predictor variable, y signifies the dependent variable. This is the outcome you're measuring - what you believe the independent variable causes or affects. That is, what this variable looks like depends on the independent variable.

In our ongoing caffeine study, the dependent variable is test performance. For the dataset I made available here, the column labeled "score" is the dependent variable - score on the fictional test. The column labeled "group" contains the independent variable - whether that person was assigned to receive caffeinated (1) or decaffeinated (0) coffee.
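If you'd like to see that split in code, here is a minimal sketch in Python. It assumes the data come as a CSV with the "group" and "score" columns described above. The four rows are made up purely for illustration (they are not values from the actual dataset), and the function names are my own hypothetical helpers, not part of any library.

```python
import csv
import io
from statistics import mean

# Made-up rows for illustration only; NOT values from the actual dataset.
SAMPLE_CSV = """group,score
1,82
0,75
1,88
0,79
"""

def split_variables(csv_text):
    """Return (x, y): independent values (group) and dependent values (score)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    x = [int(row["group"]) for row in rows]    # 1 = caffeinated, 0 = decaffeinated
    y = [float(row["score"]) for row in rows]  # score on the fictional test
    return x, y

def mean_score_by_group(x, y):
    """Average the dependent variable within each level of the independent variable."""
    scores_by_group = {}
    for group, score in zip(x, y):
        scores_by_group.setdefault(group, []).append(score)
    return {group: mean(scores) for group, scores in scores_by_group.items()}

x, y = split_variables(SAMPLE_CSV)
means = mean_score_by_group(x, y)
print(means)
```

The point of the sketch is just the roles: x holds what was assigned, y holds what was measured, and any comparison asks how y differs across the levels of x.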

One thing that is very important in establishing cause and effect is temporal order. Even if you can't manipulate the independent variable, you need to at least show that the independent variable happened first and the dependent variable happened (sometime) after. While "after" doesn't necessarily mean "because of" (a fallacy we call post hoc, ergo propter hoc, or "after this, therefore because of this"), if the independent variable didn't happen first, there's no way it could have caused the dependent variable. (So temporal order is a necessary, but not sufficient, condition.)

So in conclusion:

1 comment:

  1. Correlation doesn't mean causation. That's what I always say when I see that fallacy.
    Great post on the dependent variable!

    Y is for You Can Do It