Friday, April 5, 2019

E is for Equating

In the course of any exam, new items and even new forms have to be written. The knowledge base changes and new topics become essential. How do we make sure these new items contribute to the overall test score? Through a process called equating.

Typically in Rasch, we equate test items by sprinkling new items in with old items. When we run our Rasch analysis on items and people, we "anchor" the old items to their already established item difficulties. Because those anchored values are held fixed, the difficulties calculated for the new items land on the exact same metric as the previous items: each new difficulty is established relative to the ones that were already set through pretesting.
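To make the anchoring idea concrete, here is a minimal sketch in Python. It is not how any particular Rasch program does it: operational software typically estimates people and items together, while this toy version takes a shortcut and works in two passes (abilities from the anchored items only, then new item difficulties from those abilities). The function names, the item labels, and the tiny dataset are all made up for illustration.

```python
import math

def p_correct(theta, b):
    """Rasch model: probability that a person of ability theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def clamp(value, low, high):
    return max(low, min(high, value))

def estimate_ability(answers, difficulties, iters=50):
    """Newton-Raphson maximum likelihood ability, with item difficulties held fixed."""
    theta = 0.0
    for _ in range(iters):
        ps = [p_correct(theta, difficulties[item]) for item in answers]
        grad = sum(x - p for x, p in zip(answers.values(), ps))
        info = sum(p * (1 - p) for p in ps)
        # Steps and estimates are clamped so perfect or zero scores cannot run off to infinity.
        theta = clamp(theta + clamp(grad / info, -1.0, 1.0), -6.0, 6.0)
    return theta

def estimate_difficulty(takers, iters=50):
    """Newton-Raphson maximum likelihood difficulty from (ability, response) pairs."""
    b = 0.0
    for _ in range(iters):
        ps = [p_correct(theta, b) for theta, _ in takers]
        grad = sum(p - x for p, (_, x) in zip(ps, takers))
        info = sum(p * (1 - p) for p in ps)
        b = clamp(b + clamp(grad / info, -1.0, 1.0), -6.0, 6.0)
    return b

def calibrate_new_items(responses, anchors):
    """Place new items on the anchored scale; anchored difficulties are returned unchanged."""
    abilities = {}
    for person, answers in responses.items():
        anchored = {i: x for i, x in answers.items() if i in anchors}
        if anchored:  # people who saw no anchored items cannot be placed on the scale
            abilities[person] = estimate_ability(anchored, anchors)
    difficulties = dict(anchors)
    new_items = {i for answers in responses.values() for i in answers} - set(anchors)
    for item in sorted(new_items):
        takers = [(abilities[p], a[item]) for p, a in responses.items()
                  if item in a and p in abilities]
        difficulties[item] = estimate_difficulty(takers)
    return difficulties

# Hypothetical data: A1-A3 are old (anchored) items, N1 and N2 are new.
# Items a person did not receive are simply absent from their dict.
anchors = {"A1": -1.0, "A2": 0.0, "A3": 1.0}
responses = {
    "p1": {"A1": 1, "A2": 0, "A3": 0, "N1": 1},
    "p2": {"A1": 1, "A2": 1, "A3": 0, "N1": 1, "N2": 0},
    "p3": {"A1": 1, "A2": 0, "N1": 1, "N2": 0},
    "p4": {"A2": 1, "A3": 0, "N1": 1, "N2": 1},
    "p5": {"A1": 1, "A2": 1, "A3": 0, "N2": 0},
    "p6": {"A1": 0, "A2": 1, "A3": 0, "N1": 0, "N2": 0},
}
calibrated = calibrate_new_items(responses, anchors)
```

In this made-up dataset most people get N1 right and N2 wrong, so N1 comes out easier than the middle anchor and N2 comes out harder, both expressed in the same logits as A1 through A3.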

It's not even necessary for everyone to receive all pretest items - or even all of the old items. You just need enough overlap to create links between old and new items. In fact, when you run data from a computer adaptive test, there are a lot of "holes" in the data, creating a sparse matrix.


In the example above, few examinees received the exact same combination of items, but with an entire dataset that looks like this (and more examinees, of course), we could estimate item difficulties for new items in the set.
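One way to think about "enough overlap" is as a chain: a new item is linked if some examinee took it alongside an anchored item, or alongside another item that is itself linked. Here is a rough sketch of that check in Python. The function name and the data are hypothetical, and being linked is only a minimum requirement; it says nothing about whether there are enough responses for a stable estimate.

```python
from collections import deque

def linked_items(administered, anchors):
    """Return every item reachable from an anchored item through shared examinees."""
    # Build the reverse lookup: item -> set of people who received it.
    takers = {}
    for person, items in administered.items():
        for item in items:
            takers.setdefault(item, set()).add(person)

    seen_items = {a for a in anchors if a in takers}
    seen_people = set()
    queue = deque(seen_items)
    while queue:
        item = queue.popleft()
        for person in takers[item]:
            if person in seen_people:
                continue
            seen_people.add(person)
            # Everything this person received is now connected to the anchored scale.
            for other in administered[person]:
                if other not in seen_items:
                    seen_items.add(other)
                    queue.append(other)
    return seen_items

# Hypothetical sparse design: each person received only a few items.
administered = {
    "p1": {"A1", "N1"},
    "p2": {"A1", "A2"},
    "p3": {"N2", "N3"},  # this person saw no anchored or linked items
    "p4": {"A2", "N1"},
}
anchors = {"A1", "A2"}

linked = linked_items(administered, anchors)
all_items = set().union(*administered.values())
unlinked = all_items - linked
```

Here N1 is linked through p1 and p4, but N2 and N3 were only ever seen together by p3, so nothing ties them back to the old items.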

But you may also ask about the "old" items - do we always stick with the same difficulties or do we change those from time to time? Whenever you anchor items to existing item difficulties, the program will still estimate fresh item difficulties and let you examine something called "displacement": how much the newly estimated difficulty differs from the anchored one. You want to look at these and make sure you're not experiencing what's called "item drift," which happens when an item becomes easier or harder over time.
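As a rough illustration of displacement, the sketch below re-estimates one anchored item's difficulty from current responses (holding person abilities fixed) and subtracts the anchored value. The abilities, the responses, and the 0.5 logit flagging threshold are all assumptions for the example, not values from a real analysis, and real Rasch software may compute its displacement statistic somewhat differently.

```python
import math

def p_correct(theta, b):
    """Rasch model: probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def free_difficulty(abilities, answers, start, iters=50):
    """Newton-Raphson maximum likelihood difficulty, with person abilities held fixed."""
    b = start
    for _ in range(iters):
        ps = [p_correct(theta, b) for theta in abilities]
        grad = sum(p - x for p, x in zip(ps, answers))
        info = sum(p * (1 - p) for p in ps)
        step = max(-1.0, min(1.0, grad / info))  # clamp the step for stability
        b = max(-6.0, min(6.0, b + step))
    return b

def displacement(abilities, answers, anchored_b):
    """Freshly estimated difficulty minus the anchored difficulty, in logits."""
    return free_difficulty(abilities, answers, anchored_b) - anchored_b

# Hypothetical abilities (logits) for eight current examinees.
abilities = [-1.0, -0.5, 0.0, 0.0, 0.5, 1.0, -0.5, 0.5]

# An item anchored at 0.0 that half of this group answers correctly: no drift.
stable = displacement(abilities, [0, 0, 1, 0, 1, 1, 1, 0], anchored_b=0.0)

# An item anchored at 1.0 that six of the eight now answer correctly.
drifted = displacement(abilities, [0, 1, 1, 1, 1, 1, 0, 1], anchored_b=1.0)

DRIFT_THRESHOLD = 0.5  # assumed cutoff for illustration only
flagged = abs(drifted) > DRIFT_THRESHOLD
```

A negative displacement means the item now looks easier than its anchored value, and a positive one means it looks harder. In this toy data the second item would be flagged as a candidate for drift.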

This definitely happens. Knowledge that might have previously been considered difficult can become easier over time. Math is a great example. Many of the math classes my mom took in high school and college (and that may have been electives) were offered to me in middle school or junior high (and were mandatory). Advances in teaching, as well as better understanding of how these concepts work, can make certain concepts easier.

On the flip side, some items could get more difficult over time. My mom was required to take Latin in her college studies. I know a little "church Latin" but little else. Items about Latin would have been easier for my mom's generation, when the language was still taught, than for mine. And if we could hop in a time machine and go back to when scientific writing was in Latin (and therefore, to be a scientist, you had to speak and write Latin fluently), these items would be even easier for people.

Essentially, even though Rasch offers an objective method of measurement and a way to quantify item difficulty, difficulty is still a relative concept and can change over time. Equating is just one part of the type of testing you must regularly do with your items to maintain a test or measure.
