Deeply Trivial: Complex Models and Control Files: From the Desk of a Psychometrician

Monday, October 9, 2017

We're getting ready to send out a job analysis survey, part of our content validation study. In the meantime, I'm working on preparing control files to analyze the data when we get it back. I won't be running the analysis for a couple of weeks, but the model I'll be using is complex enough (in part because I added in some nerdy research questions to help determine best practices for these types of surveys) that I decided to start thinking about it now.

I realize there's a lot of information to unpack in that first paragraph. Without going into too much detail, here's a bit of background. We analyze survey data using the Rasch model. This model assumes that an individual's response to an item is a function of his/her ability level and the difficulty level of the item itself. For this kind of measure, where we're asking people to rate items on a scale, we're not measuring ability; rather, we're measuring a trait - an individual's proclivity toward a job task. In this arrangement, items are not difficult/easy but more common/less common, or more important/less important, and so on. The analysis gives us probabilities that people at different ability (trait) levels will respond to an item in a certain way.
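To make the idea of response probabilities concrete, here is a minimal sketch that assumes the Andrich rating scale form of the Rasch model. The function name, the threshold values, and the trait levels are all illustrative assumptions, not values from our survey or output from any Rasch software.

```python
import math


def category_probabilities(trait, item_location, thresholds):
    """Probability of each rating category under a rating scale Rasch model.

    trait         -- respondent's trait level, in logits
    item_location -- item "difficulty" (how rare or unimportant the task is)
    thresholds    -- step thresholds between adjacent categories (assumed values)
    """
    # The lowest category has a cumulative logit of 0; each higher category
    # adds (trait - item_location - threshold) for the step into it.
    logits = [0.0]
    running = 0.0
    for tau in thresholds:
        running += trait - item_location - tau
        logits.append(running)

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [weight / total for weight in weights]


# Illustrative four-category scale with made-up thresholds.
thresholds = [-1.0, 0.0, 1.0]
low = category_probabilities(trait=-2.0, item_location=0.0, thresholds=thresholds)
high = category_probabilities(trait=2.0, item_location=0.0, thresholds=thresholds)
```

Under these assumptions, the low-trait respondent is most likely to choose the bottom category and the high-trait respondent the top one, which is the pattern the analysis estimates from real data.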

It's common for job analysis surveys to use multiple rating scales on the same set of items, such as having respondents go through and rate items on how frequently they perform them, and then go through again and rate how important it is to complete a task correctly. For this kind of model, we use a Rasch Facets model. A facet is something that affects responses to an item. Technically, any Rasch model is a facets model; in a basic Rasch model, there are two facets: respondents (and their ability/trait level) and items. When you're using multiple rating scales, scale is a facet.
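One common way to write a three-facet model of this kind is the many-facet Rasch form below. This is a sketch under assumptions: the symbols are my own labels, the sign convention (every facet subtracts from the respondent's trait level) is one of several in use, and for simplicity it shows a single set of thresholds shared across scales.

```latex
\ln\!\left(\frac{P_{nisk}}{P_{nis(k-1)}}\right) = \theta_n - \delta_i - \sigma_s - \tau_k
```

Here $P_{nisk}$ is the probability that respondent $n$ chooses category $k$ for item $i$ on rating scale $s$, $\theta_n$ is the respondent's trait level, $\delta_i$ is the item's location, $\sigma_s$ is the location of the rating scale facet, and $\tau_k$ is the threshold between categories $k-1$ and $k$. Adding a further facet means adding one more subtracted term.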

And because I'm a nerd, I decided to add another facet: rating scale order. The reason we have people rate with one scale then go through and rate with the second (instead of seeing both at once) is so that people are less likely to anchor responses on one scale to responses on another scale. That is, if I rate an item as very frequent, I might also view it as more important when viewing both scales than I would have had I used the scales in isolation. But I wonder if there still might be some anchoring effects. So I decided to counterbalance. Half of respondents will get one scale first, and the other half will get the other scale first. I can analyze this facet to see if it affected responses.
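The counterbalancing itself could be set up with a simple random split. The sketch below is one way to do it, not necessarily how our survey platform handles it; the function name, the order labels, and the respondent IDs are hypothetical.

```python
import random


def assign_scale_order(respondent_ids, seed=0):
    """Randomly split respondents into two equal-sized order conditions."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(respondent_ids)
    rng.shuffle(shuffled)

    half = len(shuffled) // 2
    orders = {}
    for position, respondent_id in enumerate(shuffled):
        # First half of the shuffled list rates frequency first,
        # the remainder rates importance first.
        if position < half:
            orders[respondent_id] = "frequency_first"
        else:
            orders[respondent_id] = "importance_first"
    return orders


# Hypothetical respondent IDs 1 through 100.
assignment = assign_scale_order(range(1, 101))
```

The resulting order label for each respondent is what would be coded as the fourth facet in the data file.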

This means we have 4 facets, right? Items, respondents, rating scale, and order. Well, here's the more complex part. We have two different versions of the frequency scale: one for tasks that happen a lot (which assesses daily frequency) and one for less common tasks (which assesses weekly/monthly frequency). All items use the same importance scale. The two frequency scales have the same number of categories, but because we may need to collapse categories during the analysis phase, it's possible that we'll end up with two very different scales. So I need to factor in that, for the frequency scale, half of the items share one common response structure and the other half share the other common response structure, but for the importance scale, all items share a common response structure.
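The grouping described above can be sketched as a lookup from each item-by-scale combination to the response structure it should share. This is only an illustration of the logic, not control file syntax; the item IDs and structure names are hypothetical.

```python
from collections import defaultdict

# Hypothetical item IDs: which tasks get which version of the frequency scale.
DAILY_ITEMS = {1, 2, 3}
PERIODIC_ITEMS = {4, 5, 6}


def response_structure(item_id, scale):
    """Return the name of the response structure shared by this item and scale."""
    if scale == "importance":
        # Every item uses the same importance scale.
        return "importance"
    if scale == "frequency":
        if item_id in DAILY_ITEMS:
            return "frequency_daily"
        if item_id in PERIODIC_ITEMS:
            return "frequency_periodic"
        raise ValueError(f"Item {item_id} has no frequency scale assigned")
    raise ValueError(f"Unknown scale: {scale}")


# Collect the item-by-scale combinations that share each structure.
groups = defaultdict(list)
for item_id in sorted(DAILY_ITEMS | PERIODIC_ITEMS):
    for scale in ("frequency", "importance"):
        groups[response_structure(item_id, scale)].append((item_id, scale))
```

With these made-up items, the result is three structures: one importance structure covering all six items, and two frequency structures covering three items each. Keeping the structures separate is what would let categories be collapsed in one frequency scale without touching the other.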

I'm working on figuring out how to express that in the control file, which is a text file used by Rasch software to describe all the aspects of the model and analysis. It's similar to any syntax file for statistics software: there's a specific format needed for the program to read the file and run the specified analysis. I've spent the morning digging through help files and articles, and I think I'm getting closer to having a complete control file that should run the analysis I need.

3 comments:

Unknown, October 10, 2017 at 9:00 AM
This seems very clever to me. But I have to admit I'll need to study it more to fully understand it. This reminds me of ROC signal detection...yes?

One thought I had was whether analyses done with this level of sophistication produce results different from a more basic analysis.
Unknown, October 10, 2017 at 9:07 AM
I did a counterbalancing of scales here, to great effect.

I remember being taught in the pre-desktop computer age to always counterbalance. That seems a rare methodological control these days.

Scotti, J. R., Slack, B. S., Bowman, R. A. and Morris, T. L. 1996. College students' attitudes concerning the sexuality of persons with mental retardation: Development of the Perceptions of Sexuality Scale. Sexuality and Disability, 14: 249–63.