Sunday, March 24, 2019

Statistics Sunday: Blogging A to Z Theme Reveal

I'm excited to announce this year's theme for the Blogging A to Z challenge:


I'll be writing through the alphabet of psychometrics with the Rasch Measurement Model approach. I've written a bit about Rasch previously. You can find those posts here:

Looking forward to sharing these posts! First up is A for Ability on Monday, April 1!

Sunday, March 17, 2019

Statistics Sunday: Standardized Tests in Light of Public Scandal

No doubt, by now, you've heard about the large-scale investigation into college admissions scandals among the wealthy - a scandal that suggests SAT scores, among other things, can in essence be bought. Eliza Shapiro and Dana Goldstein of the NY Times ask if this scandal is "the last straw" for tests like the SAT.

To clarify in advance, I do not nor have I ever worked for the Educational Testing Service or for any organization involved in admissions testing. But as a psychometrician, I have a vested interest in this industry. And I became a psychometrician because of my philosophy: that many things, including ability, achievement, and college preparedness, can be objectively measured if certain procedures and methods are followed. If the methods and procedures are not followed properly in a particular case, the measurement in that case is invalid. That is what happens when a student (or more likely, their parent) pays someone else to take the SAT for them, or bribes a proctor, or finds an "expert" willing to sign off on a disability the student does not have to get extra accommodations.

But because that particular instance of measurement is invalid doesn't damn the entire field to invalidity. It just means we have to work harder. Better vetting of proctors, advances in testing like computerized adaptive testing and new item types... all of this is to help counteract outside variables that threaten the validity of measurement. And expansions in the field of data forensics now include examining anomalous patterns in testing, to identify if some form of dishonesty has taken place - allowing scores to be rescinded or otherwise declared invalid after the fact.

This is a field I feel strongly about, and as I said, really sums up my philosophy in life for the value of measurement. Today, I'm on my way to the Association of Test Publishers Innovations in Testing 2019 meeting in Orlando. I'm certain this recent scandal will be a frequent topic at the conference, and a rallying cry for better protection of exam material and better methods for identifying suspicious testing behavior. Public trust in our field is on the line. It is our job to regain that trust.


Tuesday, March 12, 2019

Are Likert Scales Superior to Yes/No? Maybe

I stumbled upon this great post from the Personality Interest Group and Espresso (PIG-E) blog about which is better - Likert scales (such as those 5-point Agree to Disagree scales you often see) or Yes/No (see also True/False)? First, they polled people on Twitter. 66% of respondents thought that going from a 7-point to 2-point scale would decrease reliability on a Big Five personality measure; 71% thought that move would decrease validity. But then things got interesting:
Before I could dig into my data vault, M. Brent Donnellan (MBD) popped up on the twitter thread and forwarded amazingly ideal data for putting the scale option question to the test. He’d collected a lot of data varying the number of scale options from 7 points all the way down to 2 points using the BFI2. He also asked a few questions that could be used as interesting criterion-related validity tests including gender, self-esteem, life satisfaction and age. The sample consisted of folks from a Qualtrics panel with approximately 215 people per group.

Here are the average internal consistencies (i.e., coefficient alphas) for 2-point (Agree/Disagree), 3-point, 5-point, and 7-point scales:

And here's what they found in terms of validity evidence - the correlation between the BFI2 and another Big Five measure, the Mini-IPIP:


FYI, when I'm examining item independence in scales I'm creating or supporting, I often use 0.7 as a cut-off - that is, items that correlate at 0.7 or higher (meaning 49% shared variance) are essentially measuring the same thing and violate the assumption of independence. The fact that all but Agreeableness correlates at or above 0.7 is pretty strong evidence that the scales, regardless of number of response options, are measuring the same thing.

The post includes a discussion of these issues by personality researchers, and includes some interesting information not just on number of response options, but also on the Big Five personality traits.

Monday, March 11, 2019

Statistics Sunday: Scatterplots and Correlations with ggpairs

As I conduct some analysis for a content validation study, I wanted to quickly blog about a fun plot I discovered today: ggpairs, which displays scatterplots and correlations in a grid for a set of variables.

To demonstrate, I'll return to my Facebook dataset, which I used for some of last year's R analysis demonstrations. You can find the dataset, a minicodebook, and code on importing into R here. Then use the code from this post to compute the following variables: RRS, CESD, Extraversion, Agree, Consc, EmoSt, Openness. These correspond to measures of rumination, depression, and the Big Five personality traits. We could easily request correlations for these 7 variables. But if I wanted scatterplots plus correlations for all 7, I can easily request it with ggpairs then listing out the columns from my dataset I want included on the plot:

library(ggplot2)
ggpairs(Facebook[,c(112,116,122:126)]

(Note: I also computed the 3 RRS subscales, which is why the column numbers above skip from 112 (RRS) to 116 (CESD). You might need to adjust the column numbers when you run the analysis yourself.)

The results look like this:


Since the grid is the number of variables squared, I wouldn't recommend this type of plot for a large number of variables.

Thursday, March 7, 2019

Time to Blog More

My blogging has been pretty much non-existent this year. Without getting too personal, I've been going through some pretty major life changes, and it's been difficult to focus on a variety of things, especially writing. As I work through this big transition, I'm thinking about what things I want to make time for and what things I should step away from.

Writing - especially about science, statistics, and psychometrics - remains very important to me. So I'm going to keep working to get back into some good blogging habits. Statistics Sunday posts may remain sporadic for a bit longer, but look for more statistics-themed posts very soon because...


That's right, it's time to sign up for the April A to Z blogging challenge! I'll officially announce my theme later this month, but for now I promise it will be stats-related.