Sunday, April 1, 2018

A is for (Cronbach's) Alpha

You may recall that last April, I blogged the A to Z of Statistics, and the very first post was alpha, or the type I error rate. This year, I'm blogging the A to Z of R, and this post is about a different kind of alpha - Cronbach's alpha, a measure of reliability, specifically internal consistency. You can find more detail about Cronbach's alpha conceptually by checking out today's Statistics Sunday post. This post focuses on how to compute Cronbach's alpha in R.

You can find the alpha function in the psych library - a package I'll be blogging about more later this month. Install if you haven't yet and always load the library before you get started (even if you just installed it):

install.packages("psych")
library(psych)

To compute alpha, you'll need a set of items (at least 3) that measure the same concept. Determining if items measure the same concept is a validity issue, but it doesn't really make sense to examine reliability when you have no evidence that the items measure the same thing. For my example throughout much of this month, I'll use the dataset from my Facebook study, since the dataset includes 6 published measures that have previous evidence of reliability and validity. I created a "small" version of the data - one that only includes ID, gender, and responses to the 6 measures.

Because this is only for demonstration purposes, I don't really want to share my original dataset - I'd rather, if people are interested in a copy, that they reach out to me and request it officially - but I did create a simulated dataset, which has means and correlations similar to those of the original data, and is formatted in the same way. You can download that dataset here to analyze along with me, as well as a mini codebook here, but note that your values will differ slightly from mine. When you save the file to your computer, either put it in your R working directory or change your working directory to the folder you saved it in.

First, I want to read in the data:

Facebook<-read.delim("small_facebook_set.txt", header=TRUE)
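If read.delim complains that it can't find the file, the working directory is the usual culprit. Here's a minimal sketch of a more defensive version, assuming the same file name as above; the file.exists check and the message are my own additions, not something the analysis requires:

```r
# File name from the post; adjust if you saved it under a different name.
data_file <- "small_facebook_set.txt"

if (file.exists(data_file)) {
  Facebook <- read.delim(data_file, header = TRUE)
  # Rows are respondents, columns are ID, gender, and the item responses.
  print(dim(Facebook))
} else {
  # Show where R is looking, so you can move the file or call setwd().
  message("Can't find ", data_file, " in ", getwd())
}
```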

I can start examining my individual measures in the data. To help with analysis, I noted which columns corresponded to each measure, and which items are reversed:
  • Ruminative Response Scale, columns 3-24
  • Savoring Beliefs Inventory, columns 25-48, even-numbered items are reversed (e.g., 2nd item, 4th item)
  • Satisfaction with Life Scale, columns 49-53
  • Ten-Item Personality Measure, columns 54-63
  • Cohen-Hoberman Inventory of Physical Symptoms, columns 64-95
  • Center for Epidemiologic Studies Depression Scale, columns 96-111, items 4, 8, and 12 are reversed
If you downloaded the simulated dataset referenced above, that file has the same columns and reversed items.
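One way to avoid retyping those column ranges is to store them once. This is just a sketch: the ranges and reversed items come from the list above, but the short names (RRS, SBI, and so on) and the object names are labels I made up for illustration.

```r
# Column ranges from the list above; the short names are my own labels.
scale_cols <- list(
  RRS   = 3:24,    # Ruminative Response Scale
  SBI   = 25:48,   # Savoring Beliefs Inventory
  SWLS  = 49:53,   # Satisfaction with Life Scale
  TIPM  = 54:63,   # Ten-Item Personality Measure
  CHIPS = 64:95,   # Cohen-Hoberman Inventory of Physical Symptoms
  CESD  = 96:111   # Center for Epidemiologic Studies Depression Scale
)

# Reversed items, numbered within each measure (not by column in the file).
reversed_items <- list(
  SBI  = seq(from = 2, to = 24, by = 2),
  CESD = c(4, 8, 12)
)

# Number of items per measure, as a quick check against the codebook.
items_per_scale <- lengths(scale_cols)
print(items_per_scale)
```

With that in place, Facebook[, scale_cols$SBI] selects the same columns as Facebook[, 25:48].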

The code for Cronbach's alpha is pretty simple; you include the range of columns that are part of the measure and which items are reverse-scored with the keys statement. For instance:

Savoring<-alpha(Facebook[,25:48], keys=c(seq(from = 2, to = 24, by = 2)))

That command runs Cronbach's alpha on that range of columns - the 24 items of the Savoring Beliefs Inventory - and reverse scores the even-numbered items. The seq command produces the same result as if I had typed keys=c(2,4,6,8,10,12,14,16,18,20,22,24). I've now created an R object called Savoring that contains the results of Cronbach's alpha. Requesting a summary of that object will give me the raw and standardized alpha, Guttman's lambda 6 (G6, another measure of reliability), the average correlation between items, and a few other descriptive statistics:

summary(Savoring)
Reliability analysis   
 raw_alpha std.alpha G6(smc) average_r S/N    ase mean   sd
      0.93      0.93    0.94      0.35  13 0.0066  5.5 0.82

I can get more detailed output (check out a sample here) by requesting the full object (Savoring, without the summary part), which tells me what would happen to alpha if I dropped a particular item and provides statistics for each item (e.g., mean and standard deviation, proportion selecting each response option, and so on).
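If raw alpha and "alpha if an item is dropped" feel like a black box, here's a rough sketch of both computed by hand in base R. Everything in it is an assumption for illustration: the toy data are made up (not from the Facebook study), alpha_by_hand is my own helper, and the reverse scoring assumes a 1-7 response scale. It isn't how psych does things internally, just the textbook formula.

```r
# Toy data: 5 respondents, 4 items on a 1-7 scale (made-up values).
toy <- data.frame(
  item1 = c(7, 6, 2, 5, 3),
  item2 = c(1, 2, 6, 4, 5),   # negatively worded
  item3 = c(6, 6, 3, 5, 2),
  item4 = c(2, 1, 5, 4, 6)    # negatively worded
)

# Reverse-score items 2 and 4 by subtracting from (scale minimum + maximum).
reversed <- c(2, 4)
toy[, reversed] <- (1 + 7) - toy[, reversed]

# Raw alpha: k/(k-1) * (1 - sum of item variances / variance of total score).
# sum(cov(items)) equals the variance of the summed score.
alpha_by_hand <- function(items) {
  k <- ncol(items)
  covmat <- cov(items)
  (k / (k - 1)) * (1 - sum(diag(covmat)) / sum(covmat))
}

alpha_all <- alpha_by_hand(toy)

# Alpha if each item is dropped in turn.
alpha_if_dropped <- sapply(seq_along(toy), function(i) alpha_by_hand(toy[, -i]))
names(alpha_if_dropped) <- names(toy)

print(round(alpha_all, 2))
print(round(alpha_if_dropped, 2))
```

If dropping an item pushes alpha well above the full-scale value, that item is a candidate for a closer look.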

If you're analyzing a scale with no reversed items, you'd just drop the keys statement:

Rumination<-alpha(Facebook[,3:24])

Not long ago, I blogged that I had misplaced my study codebook, which, among other things, identified which items are reversed. Fortunately, the alpha function has an option that could help if I had been unable to recover that information. The check.keys statement will conduct a principal components analysis of the scale and identify items with negative loadings - that is, items that correlate negatively with the other items and therefore should be reversed. The function will then reverse those items prior to running Cronbach's alpha and provide a warning in red text that lets you know items were reversed:

Savoring_check<-alpha(Facebook[,25:48], check.keys=TRUE)
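To give a feel for the idea behind check.keys, here's a bare-bones sketch in base R that flags items loading negatively on the first principal component. The toy data are made up, and this is my own simplified version of the logic described above, not the code psych actually runs.

```r
# Toy data: 5 respondents, 4 items on a 1-7 scale (made-up values).
# item2 is negatively worded and has NOT been reverse-scored.
toy <- data.frame(
  item1 = c(7, 6, 2, 5, 3),
  item2 = c(1, 2, 6, 4, 5),
  item3 = c(6, 6, 3, 5, 2),
  item4 = c(6, 7, 3, 4, 2)
)

# First principal component of the item correlation matrix.
first_pc <- eigen(cor(toy))$vectors[, 1]
names(first_pc) <- names(toy)

# The sign of an eigenvector is arbitrary, so orient it so that
# most of the weight is positive.
if (sum(first_pc) < 0) first_pc <- -first_pc

# Items loading negatively run against the rest of the scale.
flagged <- names(first_pc)[first_pc < 0]
print(flagged)
```

An automated flag like this is a fallback, not a replacement for the codebook; it's worth reading the flagged items to confirm they really are worded in the opposite direction.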

Another useful feature is delete, which, if set to TRUE, will drop items with no variance (i.e., everyone responded to an item in the exact same way). Items with no variance are less likely to turn up in 1) established measures, where meaningless items have already been weeded out, and 2) measures using rating scales. For instance, when I assisted with psychometric analysis of a new measure of adverse outcomes to opioid prescriptions, we weeded out a few items that had no variance - but then, this was a brand new measure that was essentially a checklist (scored 0 = not present, 1 = present). If the measure you're working on is new, uses a checklist format, or both, you might want to consider adding delete:

Savoring_drop<-alpha(Facebook[,25:48], delete=TRUE)

As with check.keys, you'll receive a warning.
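If you'd like to see which items have no variance before alpha drops them, a quick base R check works. This is a sketch on a made-up checklist; the object names are my own.

```r
# Toy checklist data (made-up values): 0 = not present, 1 = present.
checklist <- data.frame(
  item1 = c(0, 1, 1, 0, 1),
  item2 = c(0, 0, 0, 0, 0),   # nobody endorsed this item
  item3 = c(1, 1, 0, 0, 1),
  item4 = c(1, 0, 1, 1, 0)
)

# An item has no variance when every non-missing response is identical.
item_var <- sapply(checklist, var, na.rm = TRUE)
no_variance <- names(item_var)[item_var == 0]
print(no_variance)

# Keep only the items that vary.
checklist_clean <- checklist[, item_var > 0, drop = FALSE]
```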

You can read the documentation for the alpha function in psych here. There are many other options you can specify, such as the number of iterations (n.iter) if you want to bootstrap your confidence intervals, or the number of observations (n.obs) if you have a correlation matrix instead of raw data. The options I've covered above are the ones that you're most likely to need when examining alpha in a dataset.

