Last week, in part 1, I talked about degrees of freedom as the number of values that are free to vary. This is where the name comes from, of course, and this is still true in part 2, but there’s more to it than that, which I’ll talk about today.
In the part 1 example, I talked about why degrees of freedom for a t-test are smaller than the sample size – 2 fewer, to be exact. It’s because all but the last value in each group are free to vary. Once you get to that last value in determining the group mean, that value is determined – from a statistical standpoint, that is. But that’s not all there is to it. If that were all, we wouldn’t really need a concept of degrees of freedom. We could just set up the table of t critical values by sample size instead of degrees of freedom.
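To make that concrete, here’s a minimal sketch in Python (the two groups are made-up numbers, purely for illustration) showing the independent samples t-test’s degrees of freedom as sample size minus the two estimated group means:

```python
# Degrees of freedom for an independent samples t-test:
# each group mean "uses up" one value, so df = (n1 - 1) + (n2 - 1) = n1 + n2 - 2.
group1 = [4.1, 5.0, 4.7, 5.3, 4.9]   # made-up data, n1 = 5
group2 = [5.6, 6.1, 5.8, 6.4]        # made-up data, n2 = 4

df = (len(group1) - 1) + (len(group2) - 1)
print(df)  # 7: 9 observations minus the 2 group means we estimated
```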
And in fact, I’ve seen that suggested before. It could work in simple cases, but as many statisticians can tell you, real datasets are messy, rarely simple, and often require more complex approaches. So instead, we teach concepts that become relevant in complex cases using simple cases. A good way to get your feet wet, yes, but perhaps a poor demonstration of why these concepts are important. And confusion about these concepts – even among statistics professors – remains, because some of these concepts just aren't intuitive.
Degrees of freedom can be thought of as the number of independent values that can be used for estimation.
Statistics is all about estimation, and as statistics become more and more complex, the estimation process also becomes more complex. Doing all that estimating requires some inputs. The number of inputs places a limit on how many things we can estimate, our outputs. That’s what your degrees of freedom tells you – it’s how many things you can estimate (output) based on the amount of data you have to work with (input). It keeps us from double-dipping – you can't reuse the same information to estimate a different value. Instead, you have to slice up the data in a different way.
Degrees of freedom measures the statistical fuel available for the analysis.
For analyses like a t-test, we don’t need to be too concerned with degrees of freedom. Sure, it costs us 1 degree of freedom for each group mean we calculate, but as long as we have a reasonable sample size, those 2 degrees of freedom we lose won't cause us much worry. We need to know degrees of freedom, of course, so we know which row to check in our table of critical values – but even that has become an unnecessary step thanks to computer analysis. Even when you’re doing a different t-test approach that alters your degrees of freedom (like Welch’s t, which is used when the variances between your two groups aren’t equal – more on that test later, though I've mentioned it once before), it’s not something statisticians really pay attention to.
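Welch’s t is one place where the degrees of freedom stop being a round number. Here’s a sketch of the Welch–Satterthwaite approximation it uses (the data are made up; with unequal variances, the df typically comes out fractional and smaller than n1 + n2 − 2):

```python
import statistics

# Welch-Satterthwaite approximation for the degrees of freedom of Welch's t-test,
# used when the two groups' variances aren't assumed equal.
def welch_df(a, b):
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    na, nb = len(a), len(b)
    num = (va / na + vb / nb) ** 2
    den = (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
    return num / den

group1 = [4.1, 5.0, 4.7, 5.3, 4.9]   # made-up data, low variance
group2 = [5.6, 8.1, 5.8, 9.4]        # made-up data, higher variance

print(round(welch_df(group1, group2), 2))  # fractional, less than n1 + n2 - 2 = 7
```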
But when we start adding in more variables, we see our degrees of freedom decrease as we begin using those degrees of freedom to estimate values. We start using up our statistical fuel.
And if you venture into even more complex approaches, like structural equation modeling (one of my favorites), you’ll notice your degrees of freedom can get used up very quickly – in part because your input for SEM is not the individual data but a matrix derived from the data (specifically a covariance matrix, which I should also blog about sometime). That was the first time I remember being in a situation where my degrees of freedom didn't seem limitless, where I had to simplify my analysis because I had used up all my degrees of freedom, and not just once. Even very simple models could be impossible to estimate based on the available degrees of freedom. I learned that degrees of freedom isn’t just some random value that comes along with my analysis.
It’s a measure of resources for estimation and those resources are limited.
For my fellow SEM nerds, I might have to start referring to saturated models – models where you’ve used up every degree of freedom – as “out of gas.”
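In SEM, the size of the fuel tank is easy to count: with p observed variables, the covariance matrix supplies p(p + 1)/2 unique values, and every free parameter you estimate burns one. A hypothetical sketch of that arithmetic (the parameter counts are invented for illustration):

```python
# In SEM, the input is the covariance matrix, so the number of independent
# pieces of information is p*(p+1)/2 (the unique variances and covariances),
# not the raw sample size. Model df = known values minus free parameters.
def sem_df(p, free_parameters):
    knowns = p * (p + 1) // 2
    return knowns - free_parameters

# Hypothetical example: 6 observed variables -> 21 knowns.
print(sem_df(6, 21))  # 0: estimating 21 parameters is saturated, "out of gas"
print(sem_df(6, 13))  # 8: a more parsimonious model leaves 8 df in the tank
```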
Perhaps the best way to demonstrate degrees of freedom as statistical fuel is by showing how degrees of freedom are calculated for the analysis of variance (ANOVA). In fact, it was Ronald Fisher who came up with both the concept of degrees of freedom and the ANOVA (and the independent samples t-test referenced in part 1 and again above). Fisher also came up with the correct way to determine degrees of freedom for Pearson’s chi-square – much to the chagrin of Karl Pearson, who was using the wrong degrees of freedom for his own test.
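Fisher’s correction is easy to state: for an r × c contingency table, once the row and column totals are fixed, only (r − 1)(c − 1) cell counts are free to vary. A quick sketch of the two counting rules:

```python
# Degrees of freedom for Pearson's chi-square test on an r x c contingency table.
# Fisher's (correct) count: with row and column totals fixed,
# only (r - 1) * (c - 1) cells are free to vary.
def chi_square_df(rows, cols):
    return (rows - 1) * (cols - 1)

# For a 2 x 2 table, Fisher's df = 1;
# counting "cells minus 1" instead would have (wrongly) given 2*2 - 1 = 3.
print(chi_square_df(2, 2))  # 1
print(chi_square_df(3, 4))  # 6
```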
First, remember that in ANOVA, we’re comparing our values to the grand mean (the overall mean of everyone in the sample, regardless of which group they fall in). Under the null hypothesis, this is our expected value for all groups in our analysis. That by itself uses 1 degree of freedom – the last value is no longer free to vary, as discussed in part 1 and reiterated above. (Alternatively, you could think of it as spending 1 degree of freedom to calculate that grand mean.) So our total degrees of freedom for ANOVA is N-1. That's always going to be our starting point. Now, we take that quantity and start partitioning it out to each part of our analysis.
Next, remember that in ANOVA, we’re looking for effects by partitioning variance – variance due to group differences (our between groups effect) and variance due to chance or error (our within group differences). Our degrees of freedom for looking at the between groups effect is determined by how many groups we have, usually called k, minus 1.
Let’s revisit the movie theatre example from the ANOVA post.
Review all the specifics here, but the TL;DR is that you're at the movie theatre with 3 friends who argue about where to sit in the theatre: front, middle, or back. You offer to do a survey of people in these different locations to see which group best enjoyed the movie, because you're that kind of nerd.
If we want to find out who had the best movie-going experience of people sitting in the front, middle, or back of the theatre, we would use a one-way ANOVA comparing 3 groups. If k is 3, our between groups degrees of freedom is 2. (We only need two because we have the grand mean, and if we have two of the three group means – the between groups effect – we can figure out that third value.)
We subtract those 2 degrees of freedom from our total degrees of freedom. If we don’t have another variable we’re testing – another between groups effect – the remaining degrees of freedom can all go toward estimating within group differences (error). We want our error degrees of freedom to be large, because we take the error (within group) variance and divide it by the within group degrees of freedom. The more degrees of freedom we have here, the smaller our error term, meaning our statistic is more likely to be significant.
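Here’s the one-way bookkeeping for the theatre example, assuming a made-up total sample of 30 moviegoers:

```python
# One-way ANOVA degrees of freedom for the theatre example.
# Hypothetical numbers: N = 30 people surveyed across k = 3 seat locations.
N, k = 30, 3

df_total = N - 1                     # 29: one df spent on the grand mean
df_between = k - 1                   # 2: the seat location effect
df_within = df_total - df_between    # 27: whatever is left goes to error

print(df_total, df_between, df_within)  # 29 2 27
assert df_total == df_between + df_within  # the partition always balances
```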
But what if we had another variable? What if, in addition to testing the effect of seat location (front, middle, or back), we also decided to test the effect of gender? We could even test an interaction between seat location and gender to see if men and women have different preferences on where to sit in the theatre. We can do that, but adding those estimates in is going to cost us more degrees of freedom. We can't take any degrees of freedom from the seat location analysis – they're already spoken for. So we take more degrees of freedom from the leftover that goes toward error.
For gender, where k equals 2, we would need 1 degree of freedom. And for the interaction, seat location X gender, we would multiply the seat location degrees of freedom by the gender degrees of freedom, so we need 2 more degrees of freedom to estimate that effect. Whatever is left goes toward the error estimate. Sure, our leftover degrees of freedom is smaller than it was before we added the new variables, but the error variance is also probably smaller. We’re paying for it with degrees of freedom, but we’re also moving more variance from the error row to the systematic row.
This is part of the trade-off we have to make in analyzing data – trade-offs between simplicity and explaining as much variance as possible. In this regard, degrees of freedom serve as a reminder of that trade-off in action: the resources you’re spending to run your planned analysis.
It's all fun and games until someone runs out of degrees of freedom.