Tuesday, April 11, 2017

I is for Interval Variables

Research is all about working with variables, and looking for things that explain, maybe even predict, why these values vary. We use statistics to examine these explanations and/or predictors. There are a variety of statistical tests and new ones are always being invented, not just because computers have gotten better and more widely available (so we can do complex analyses quickly and without the need to check out space for a mainframe computer), but also because the types of data we might want to analyze have changed.

Regardless, variables can be classified as one of four types. Even as new types of data become available - social network connections on Facebook, data from brain scans, etc. - that classification still holds:
• Nominal
• Ordinal
• Interval
• Ratio
If you like mnemonics, you can remember them with NOIR (whether you choose to think of that as the French word for black or the name of a fiction genre is totally up to you). Nominal variables are categories, like gender or race - you may assign numbers to them for coding purposes but they don't have any numeric meaning. Assigning a particular group as the first category in your coding scheme doesn't suddenly make that category better, because the numbers you assign would be completely arbitrary.
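To see why the numbers attached to nominal categories are arbitrary, here is a minimal Python sketch. The responses and both coding schemes are made up for illustration (I picked favorite color as a neutral example), so treat it as a demonstration rather than real data.

```python
from collections import Counter
from statistics import mean

# Hypothetical answers to a nominal question (favorite color); invented for illustration.
responses = ["red", "blue", "blue", "green", "red", "blue"]

# Two equally valid, equally arbitrary coding schemes for the same categories.
coding_a = {"red": 1, "blue": 2, "green": 3}
coding_b = {"green": 1, "red": 2, "blue": 3}

# The "average category" depends entirely on which scheme we happened to pick.
mean_a = mean(coding_a[r] for r in responses)
mean_b = mean(coding_b[r] for r in responses)
print(mean_a, mean_b)

# Counts and the most common category do not depend on the codes at all.
counts = Counter(responses)
most_common = counts.most_common(1)[0][0]
print(counts, most_common)
```

The two means disagree even though the data never changed, which is why frequencies and the mode are the usual summaries for nominal variables.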

Next are ordinal variables, variables where the numbers have meaning, but are simply ranks. The difference between one pair of consecutive ranks may not be equal to the difference between another pair of consecutive ranks. Let me explain with an example: if you're running a race and you have the fastest time, you come in 1st place. The person who came in 2nd might have been right on your heels, while the person who came in 3rd may have been seconds behind 2nd place. The order has meaning, but the difference between what it took you to get 1st over 2nd place is not necessarily equal to the difference between what it took 2nd place to beat 3rd place. Even though the variable involves numbers, it really doesn't make sense to report a mean or standard deviation. I'll talk more about ordinal variables this month, because they're actually a point of contention, especially in my field.
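The race example can be sketched in a few lines of Python. The runner names and finish times are hypothetical, chosen only so that 2nd place is right on the winner's heels and 3rd place is seconds back.

```python
# Hypothetical finish times in seconds; names and values are invented for illustration.
finish_times = {"Ana": 62.10, "Ben": 62.15, "Cal": 65.40}

# Rank runners by time: the fastest time gets rank 1.
ordered = sorted(finish_times, key=finish_times.get)
ranks = {name: position + 1 for position, name in enumerate(ordered)}

# The gap between consecutive ranks is always exactly 1 ...
rank_gaps = [ranks[b] - ranks[a] for a, b in zip(ordered, ordered[1:])]

# ... but the time gaps hiding behind those ranks are not equal.
time_gaps = [round(finish_times[b] - finish_times[a], 2)
             for a, b in zip(ordered, ordered[1:])]

print(ranks)
print(rank_gaps)
print(time_gaps)
```

The ranks keep the order but throw away the spacing, which is exactly the information a mean or standard deviation would need.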

Last are interval and ratio variables. For all intents and purposes, these variables are treated the same way in analysis. They are continuous (meaning it makes sense to report things like a mean and standard deviation) and equal interval (that is, the difference between 1 and 2 is the same as the difference between 2 and 3). The way they differ is in whether they have a meaningful 0 value.

A ratio variable is one where 0 reflects an absence of something. If you have an empty scale, it would register weight as 0; there is an absence of weight. When you have a meaningful 0, you can also create ratios; for instance, you can say that a person who weighs 200 pounds weighs twice as much as a person who weighs 100 pounds.

Interval variables, on the other hand, may have a 0 value but it doesn't mean an absence of something. For instance, a temperature of 0° C doesn't reflect an absence of temperature. It's simply a point on the scale. If you've taken a statistics and/or research methods course, you've probably spent a lot of time working with interval variables. And if you've ever participated in research - especially social scientific research - you've probably provided interval data. Psychological measures often provide data that is continuous but doesn't have a true 0 value. Even variables that look like they have a meaningful 0 value may not in practice.

For example, you may have received questionnaires with a "neutral" option in the middle. Sometimes, they might include 0 above the descriptor, and the options on either side get positive or negative values. You might think neutral means an absence of attitude about the topic. But think about how you use that middle option - is it always absence? Sometimes it's indecision - you can't choose whether your attitude is positive or negative, and maybe it's a little of both, so you choose 0. Maybe you didn't understand the question (there are lots of bad survey questions, after all), but you didn't want to leave it blank. That 0 value can take on different meanings depending on who is responding to the survey.
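Going back to the weight and temperature examples, one rough way to check whether a 0 is meaningful is to change units and see whether ratios survive. The sketch below assumes the standard conversions (1 pound = 0.45359237 kg, and F = C × 9/5 + 32); the helper function names are my own.

```python
def pounds_to_kg(pounds):
    """Convert pounds to kilograms (0 lb maps to 0 kg, so the zero is shared)."""
    return pounds * 0.45359237


def celsius_to_fahrenheit(celsius):
    """Convert Celsius to Fahrenheit (0 C maps to 32 F, so the zero moves)."""
    return celsius * 9 / 5 + 32


# Ratio variable: "twice as heavy" holds in pounds and in kilograms.
weight_ratio_lb = 200 / 100
weight_ratio_kg = pounds_to_kg(200) / pounds_to_kg(100)

# Interval variable: 20 C is not "twice as warm" as 10 C once you switch to Fahrenheit.
temp_ratio_c = 20 / 10
temp_ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Equal intervals do survive: 10 -> 20 and 20 -> 30 are the same size in either unit.
gaps_f = (
    celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10),
    celsius_to_fahrenheit(30) - celsius_to_fahrenheit(20),
)

print(weight_ratio_lb, weight_ratio_kg)
print(temp_ratio_c, temp_ratio_f)
print(gaps_f)
```

The weight ratio stays at 2 in both units, while the temperature ratio changes with the unit even though the spacing between values stays equal. That is the practical difference between a ratio scale and an interval scale.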

At my job, we work most often with interval data. There's more to it than that, which I'll explain later. Stay tuned!