In a previous post this month, I talked about the four types of variables (nominal, ordinal, interval, and ratio) and dug into the interval classification. I said I mostly work with interval variables but that there was more to it than that. And that brings us to today's post. Because I also work with ordinal variables, and part of what I do in my job involves transforming ordinal variables into interval variables.
Most statistical analyses require interval or ratio variables. There are options for nominal and ordinal variables, but they tend to be more limited, and an introductory statistics course will mostly cover analyses for interval/ratio data. Remember that the key difference between ordinal and interval data is equal intervals: for an interval variable, the difference between 1 and 2 is the same as the difference between 2 and 3, but that isn't necessarily true for an ordinal variable. Ordinal variables are ranks. They tell you the order of the values but not how far apart they are.
You would think this difference would be straightforward and that everyone would agree on what is ordinal and what is interval. But you'd be wrong, because this is an ongoing point of contention in my field.
In my job, I work with test data: achievement tests, cognitive ability tests, language surveys, and so on. A raw score on the test is the number of items a person answered correctly. So that should be an interval variable, right? Not necessarily. It depends on which test theory you subscribe to.
In the beginning, there was classical test theory. It focused on the test as a whole unit, with overall reliability (i.e., consistency) and validity (i.e., measures what it's supposed to measure). The raw score was the sum of a person's true ability (what you're trying to measure) and error (there it is again). And test developers thought that it was good.
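To make the "true ability plus error" idea concrete, here is a minimal sketch in Python. It assumes the error is random noise drawn from a normal distribution centered on zero, and it treats scores as continuous numbers for simplicity. The true score of 30 and the error spread of 3 are made-up values for illustration, not numbers from any real test.

```python
import random

def observed_score(true_score, error_sd, rng):
    """Classical test theory: observed score X = true score T + error E.

    Assumption for this sketch: E is normal with mean 0 and
    standard deviation error_sd.
    """
    error = rng.gauss(0.0, error_sd)
    return true_score + error

rng = random.Random(42)   # fixed seed so the sketch is repeatable
true_score = 30.0         # hypothetical true ability, in raw-score units
error_sd = 3.0            # hypothetical amount of measurement error

# Pretend the same person could take the test many times with no memory of it.
administrations = [observed_score(true_score, error_sd, rng) for _ in range(10_000)]
mean_observed = sum(administrations) / len(administrations)

print(f"true score:          {true_score}")
print(f"mean observed score: {mean_observed:.2f}")
```

Any single observed score can miss the true score by a few points, but because the errors average out to zero, the mean over many hypothetical administrations lands close to it.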
But then along came other approaches, such as item response theory (IRT) and the Rasch measurement model. The developers of these approaches argued that individual items are not necessarily equal. Some are more difficult than others. Some are biased. Some might not measure what you think they do. Some provide redundant information. We should focus on individual items as much as on the test as a whole. In these approaches, a person's response to an item is driven by the difficulty of the item and the test-taker's underlying ability. Because not all items are created equal, we don't have equal interval data. A raw score is not an interval variable; it's ordinal. But IRT and Rasch models transform raw scores into an equal interval variable by taking item difficulty into account. As long as you use the scale score, you have equal interval data.
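Here is a rough sketch of that transformation under the Rasch model, where the probability of a correct answer depends only on the gap between the person's ability and the item's difficulty (both on a logit scale). The item difficulties below are invented for illustration, and the function names are my own rather than anything from a psychometrics package. For each raw score, the code finds the ability at which the expected number of correct answers equals that raw score, which is the maximum likelihood ability estimate under the Rasch model.

```python
import math

def p_correct(theta, difficulty):
    """Rasch model: probability of a correct response given ability theta."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def expected_raw_score(theta, difficulties):
    """Expected number of items answered correctly at ability theta."""
    return sum(p_correct(theta, b) for b in difficulties)

def ability_for_raw_score(raw, difficulties, lo=-10.0, hi=10.0, tol=1e-9):
    """Find theta where the expected raw score equals the observed raw score.

    Uses bisection, which works because the expected score rises steadily
    with theta. Zero and perfect scores have no finite estimate.
    """
    if not 0 < raw < len(difficulties):
        raise ValueError("raw score must be strictly between 0 and the number of items")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if expected_raw_score(mid, difficulties) < raw:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical item difficulties in logits (not from any real test).
difficulties = [-2.0, -1.0, -0.5, 0.0, 0.0, 0.5, 1.0, 2.0]

abilities = {raw: ability_for_raw_score(raw, difficulties)
             for raw in range(1, len(difficulties))}

# Size of the ability jump for each one-point gain in raw score.
steps = [abilities[r + 1] - abilities[r] for r in range(1, len(difficulties) - 1)]

for raw, theta in abilities.items():
    print(f"raw score {raw}: ability {theta:+.2f} logits")
```

With these made-up items, going from a raw score of 1 to 2 takes a bigger jump in ability than going from 3 to 4. Each step is one point on the raw scale, but the steps are not the same size on the ability scale, which is the sense in which the raw score behaves like an ordinal variable.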
See what I mean?