
Tuesday, April 17, 2018

O is for Overview Reports

One of the best things you can do is to create a study codebook to accompany your dataset. In it, you should include information about the study variables and how they were created or computed. It's also nice to have a summary of your dataset, all in one place, so you can quickly check for any issues, begin cleaning your data, and/or plan your analysis. Creating and formatting that report, and running the various descriptive statistics and plots, can be tedious, though. What if an R package could do much of that for you?

dataMaid to the rescue! In addition to having some great functions to help streamline data cleaning, dataMaid can create an overview report of your dataset, containing the information you request, and generate an R Markdown file to which you could add descriptive information, like functions used to calculate variables, item text, and so on.

For today's post, I'll use the simulated Facebook dataset I created and shared. You can replicate the exact results I get if you use that dataset. After importing the file, I want to make certain all variables are of the correct type. If necessary, I can make some changes. This becomes important when we go to generate our report. I also want to score all my scales, so I have total and subscale scores in the file.

Facebook<-read.delim(file="simulated_facebook_set.txt", header=TRUE)
str(Facebook)
## 'data.frame': 257 obs. of  111 variables:
##  $ ID            : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ gender        : int  1 1 1 0 1 0 1 0 1 1 ...
##  $ Rum1          : int  3 2 3 2 2 2 0 4 1 0 ...
##  $ Rum2          : int  1 2 4 2 4 1 1 2 2 2 ...
##  $ Rum3          : int  3 3 2 2 2 2 0 1 2 2 ...
##  $ Rum4          : int  1 2 4 0 2 4 1 0 2 2 ...
##  $ Rum5          : int  3 1 2 2 1 3 1 0 0 2 ...
##  $ Rum6          : int  2 3 4 3 2 2 1 1 3 2 ...
##  $ Rum7          : int  3 1 4 3 0 3 3 4 3 2 ...
##  $ Rum8          : int  1 2 4 1 0 1 1 1 3 3 ...
##  $ Rum9          : int  3 0 2 0 1 0 3 2 2 0 ...
##  $ Rum10         : int  1 1 2 2 2 2 2 0 1 1 ...
##  $ Rum11         : int  1 0 0 3 1 2 3 0 4 3 ...
##  $ Rum12         : int  0 2 2 1 0 1 0 2 2 0 ...
##  $ Rum13         : int  4 2 3 3 3 2 1 1 2 2 ...
##  $ Rum14         : int  0 1 3 1 2 2 2 2 4 2 ...
##  $ Rum15         : int  2 2 1 2 2 2 2 1 3 0 ...
##  $ Rum16         : int  2 4 4 0 1 2 0 1 2 4 ...
##  $ Rum17         : int  1 2 2 2 1 3 2 1 2 3 ...
##  $ Rum18         : int  2 2 4 1 2 2 2 1 1 1 ...
##  $ Rum19         : int  0 2 2 1 2 4 2 2 1 0 ...
##  $ Rum20         : int  1 1 2 2 1 1 1 2 4 2 ...
##  $ Rum21         : int  2 2 1 1 1 1 1 3 4 0 ...
##  $ Rum22         : int  2 1 2 2 1 2 0 1 1 1 ...
##  $ Sav1          : int  5 6 7 4 5 6 6 7 6 7 ...
##  $ Sav2          : int  3 2 6 2 2 6 3 6 3 2 ...
##  $ Sav3          : int  7 7 7 6 7 5 6 6 7 7 ...
##  $ Sav4          : int  4 5 5 4 3 5 1 5 2 5 ...
##  $ Sav5          : int  7 6 7 7 5 6 6 6 4 6 ...
##  $ Sav6          : int  4 0 6 4 2 2 3 4 6 4 ...
##  $ Sav7          : int  3 6 5 6 7 7 7 7 7 6 ...
##  $ Sav8          : int  2 2 3 3 2 4 3 3 3 5 ...
##  $ Sav9          : int  6 4 6 6 6 6 6 7 6 5 ...
##  $ Sav10         : int  2 4 1 2 1 3 2 5 1 1 ...
##  $ Sav11         : int  3 3 6 2 6 6 4 1 3 4 ...
##  $ Sav12         : int  0 3 3 3 4 4 3 4 5 3 ...
##  $ Sav13         : int  3 7 7 4 4 3 5 5 7 4 ...
##  $ Sav14         : int  2 2 5 0 3 2 2 2 3 2 ...
##  $ Sav15         : int  5 6 5 5 4 7 4 6 7 7 ...
##  $ Sav16         : int  3 2 2 6 2 3 1 3 2 2 ...
##  $ Sav17         : int  6 3 6 6 5 4 6 6 6 5 ...
##  $ Sav18         : int  2 2 2 3 2 6 3 2 1 2 ...
##  $ Sav19         : int  6 7 6 6 6 7 2 4 6 4 ...
##  $ Sav20         : int  3 2 3 4 6 6 6 3 3 7 ...
##  $ Sav21         : int  6 3 3 6 4 7 6 6 7 4 ...
##  $ Sav22         : int  1 4 2 2 2 2 2 2 3 3 ...
##  $ Sav23         : int  7 7 6 4 6 7 6 4 6 5 ...
##  $ Sav24         : int  2 1 5 1 1 1 3 1 2 2 ...
##  $ LS1           : int  3 5 6 4 7 4 4 6 6 7 ...
##  $ LS2           : int  6 2 6 5 7 5 6 5 6 4 ...
##  $ LS3           : int  7 4 6 4 3 6 5 2 6 6 ...
##  $ LS4           : int  2 6 6 3 6 6 5 6 7 5 ...
##  $ LS5           : int  3 6 7 5 4 4 4 1 4 2 ...
##  $ Extraverted   : int  5 4 6 6 3 3 7 4 4 6 ...
##  $ Critical      : int  5 5 5 3 4 5 1 6 6 5 ...
##  $ Dependable    : int  6 6 6 5 6 6 6 6 7 7 ...
##  $ Anxious       : int  6 6 6 4 6 6 5 6 5 4 ...
##  $ NewExperiences: int  7 6 6 6 6 6 7 6 3 6 ...
##  $ Reserved      : int  3 4 6 5 5 5 3 4 7 2 ...
##  $ Sympathetic   : int  7 6 7 6 5 6 3 7 6 6 ...
##  $ Disorganized  : int  6 5 6 5 5 5 3 4 7 5 ...
##  $ Calm          : int  6 7 6 5 6 7 6 3 5 6 ...
##  $ Conventional  : int  3 4 2 3 2 2 2 2 3 3 ...
##  $ Health1       : int  1 1 4 2 1 3 2 3 3 0 ...
##  $ Health2       : int  2 1 1 0 2 1 0 2 2 1 ...
##  $ Health3       : int  2 1 2 0 1 2 3 2 2 1 ...
##  $ Health4       : int  0 2 0 0 1 0 0 1 0 0 ...
##  $ Health5       : int  1 1 1 0 1 1 1 0 3 0 ...
##  $ Health6       : int  0 0 2 0 0 0 1 0 1 0 ...
##  $ Health7       : int  0 0 3 2 0 1 1 2 1 0 ...
##  $ Health8       : int  2 3 4 2 1 1 1 0 2 2 ...
##  $ Health9       : int  2 3 3 1 2 4 4 2 2 3 ...
##  $ Health10      : int  0 0 1 1 0 1 1 1 0 0 ...
##  $ Health11      : int  0 1 2 1 2 0 0 0 2 0 ...
##  $ Health12      : int  0 2 1 2 0 1 1 0 0 0 ...
##  $ Health13      : int  0 2 2 0 2 3 1 1 0 1 ...
##  $ Health14      : int  2 2 3 0 0 2 1 2 2 0 ...
##  $ Health15      : int  2 1 1 0 1 0 0 0 1 0 ...
##  $ Health16      : int  1 1 3 0 2 3 2 1 0 0 ...
##  $ Health17      : int  0 0 0 2 2 3 0 2 1 0 ...
##  $ Health18      : int  1 4 1 1 0 0 0 1 2 0 ...
##  $ Health19      : int  0 2 2 0 0 1 0 0 2 1 ...
##  $ Health20      : int  2 1 2 0 1 1 0 0 0 0 ...
##  $ Health21      : int  1 0 1 1 0 2 0 1 1 1 ...
##  $ Health22      : int  3 1 2 0 2 4 2 2 0 2 ...
##  $ Health23      : int  1 0 3 2 2 0 2 3 2 2 ...
##  $ Health24      : int  0 0 1 1 1 0 0 2 1 0 ...
##  $ Health25      : int  0 3 1 2 2 0 2 0 1 0 ...
##  $ Health26      : int  1 0 0 0 0 0 2 1 1 2 ...
##  $ Health27      : int  2 1 0 1 1 0 0 1 0 1 ...
##  $ Health28      : int  0 3 2 0 1 3 0 2 3 2 ...
##  $ Health29      : int  1 2 1 0 1 1 2 1 1 2 ...
##  $ Health30      : int  0 0 0 2 0 0 0 0 0 0 ...
##  $ Health31      : int  1 0 0 0 0 2 1 0 0 0 ...
##  $ Health32      : int  2 1 2 1 2 2 2 1 2 0 ...
##  $ Dep1          : int  0 0 2 0 1 1 1 1 0 0 ...
##  $ Dep2          : int  0 1 0 0 0 0 0 0 1 2 ...
##  $ Dep3          : int  0 1 0 0 0 0 1 0 2 2 ...
##  $ Dep4          : int  1 0 1 1 0 0 0 1 1 0 ...
##   [list output truncated]
Facebook$ID<-as.character(Facebook$ID)
Facebook$gender<-factor(Facebook$gender, labels=c("Male","Female"))
Rumination<-Facebook[,3:24]
Savoring<-Facebook[,25:48]
SatwithLife<-Facebook[,49:53]
CHIPS<-Facebook[,64:95]
CESD<-Facebook[,96:111]
Facebook$RRS<-rowSums(Facebook[,3:24])
Facebook$RRS_D<-rowSums(Facebook[,c(3,4,5,6,8,10,11,16,19,20,21,24)])
Facebook$RRS_R<-rowSums(Facebook[,c(9,13,14,22,23)])
Facebook$RRS_B<-rowSums(Facebook[,c(7,12,15,17,18)])
reverse<-function(max,min,x) {
  y<-(max+min)-x
  return(y)
  }
Facebook$Sav2R<-reverse(7,1,Facebook$Sav2)
Facebook$Sav4R<-reverse(7,1,Facebook$Sav4)
Facebook$Sav6R<-reverse(7,1,Facebook$Sav6)
Facebook$Sav8R<-reverse(7,1,Facebook$Sav8)
Facebook$Sav10R<-reverse(7,1,Facebook$Sav10)
Facebook$Sav12R<-reverse(7,1,Facebook$Sav12)
Facebook$Sav14R<-reverse(7,1,Facebook$Sav14)
Facebook$Sav16R<-reverse(7,1,Facebook$Sav16)
Facebook$Sav18R<-reverse(7,1,Facebook$Sav18)
Facebook$Sav20R<-reverse(7,1,Facebook$Sav20)
Facebook$Sav22R<-reverse(7,1,Facebook$Sav22)
Facebook$Sav24R<-reverse(7,1,Facebook$Sav24)
Facebook$SBI<-Facebook$Sav2R+Facebook$Sav4R+Facebook$Sav6R+
  Facebook$Sav8R+Facebook$Sav10R+Facebook$Sav12R+Facebook$Sav14R+
  Facebook$Sav16R+Facebook$Sav18R+Facebook$Sav20R+Facebook$Sav22R+
  Facebook$Sav24R+Facebook$Sav1+Facebook$Sav3+Facebook$Sav5+
  Facebook$Sav7+Facebook$Sav9+Facebook$Sav11+Facebook$Sav13+Facebook$Sav15+
  Facebook$Sav17+Facebook$Sav19+Facebook$Sav21+Facebook$Sav23
Facebook$SavPos<-Facebook$Sav2R+Facebook$Sav4R+Facebook$Sav6R+
  Facebook$Sav8R+Facebook$Sav10R+Facebook$Sav12R+Facebook$Sav14R+
  Facebook$Sav16R+Facebook$Sav18R+Facebook$Sav20R+Facebook$Sav22R+
  Facebook$Sav24R
Facebook$SavNeg<-Facebook$Sav1+Facebook$Sav3+Facebook$Sav5+
  Facebook$Sav7+Facebook$Sav9+Facebook$Sav11+Facebook$Sav13+Facebook$Sav15+
  Facebook$Sav17+Facebook$Sav19+Facebook$Sav21+Facebook$Sav23
Facebook$Anticipating<-Facebook$Sav1+Facebook$Sav4R+Facebook$Sav7+
  Facebook$Sav10R+Facebook$Sav13+Facebook$Sav16R+Facebook$Sav19+Facebook$Sav22R
Facebook$Moment<-Facebook$Sav2R+Facebook$Sav5+Facebook$Sav8R+
  Facebook$Sav11+Facebook$Sav14R+Facebook$Sav17+Facebook$Sav20R+Facebook$Sav23
Facebook$Reminiscing<-Facebook$Sav3+Facebook$Sav6R+Facebook$Sav9+
  Facebook$Sav12R+Facebook$Sav15+Facebook$Sav18R+Facebook$Sav21+Facebook$Sav24R
Facebook$SWLS<-rowSums(SatwithLife)
Facebook$CritR<-reverse(7,1,Facebook$Critical)
Facebook$AnxR<-reverse(7,1,Facebook$Anxious)
Facebook$ResR<-reverse(7,1,Facebook$Reserved)
Facebook$DisR<-reverse(7,1,Facebook$Disorganized)
Facebook$ConvR<-reverse(7,1,Facebook$Conventional)
Facebook$Extraversion<-(Facebook$Extraverted+Facebook$ResR)/2
Facebook$Agree<-(Facebook$CritR+Facebook$Sympathetic)/2
Facebook$Consc<-(Facebook$Dependable+Facebook$DisR)/2
Facebook$EmoSt<-(Facebook$AnxR+Facebook$Calm)/2
Facebook$Openness<-(Facebook$NewExperiences+Facebook$ConvR)/2
Facebook$Health<-rowSums(CHIPS)
Facebook$Dep4R<-reverse(3,0,Facebook$Dep4)
Facebook$Dep8R<-reverse(3,0,Facebook$Dep8)
Facebook$Dep12R<-reverse(3,0,Facebook$Dep12)
Facebook$CESD<-Facebook$Dep1+Facebook$Dep2+Facebook$Dep3+Facebook$Dep4R+Facebook$Dep5+
  Facebook$Dep6+Facebook$Dep7+Facebook$Dep8R+Facebook$Dep9+Facebook$Dep10+Facebook$Dep11+
  Facebook$Dep12R+Facebook$Dep13+Facebook$Dep14+Facebook$Dep15+Facebook$Dep16
library(dataMaid)
## Loading required package: ggplot2

To generate an overview report, we'll use the makeDataReport function, which summarizes your dataset with descriptive statistics, some checks to assist with data cleaning, and plenty of customization options. And you can very easily add codebook information to the report once it's in R Markdown format.

The makeDataReport function has many arguments, which you can read all about here. Today, we'll focus on some of the key ones. The very first argument is the dataset you want to report on, in this case Facebook. You could run the function with only this argument if you wanted, which might be fine if you have a small dataset and just want a quick look.
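
For a quick look like that, the call is just the dataset itself (everything else falls back to dataMaid's defaults):

makeDataReport(Facebook)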

Next, define what kind of file you want the function to output: "pdf", "word", or "html". The default, NULL, picks the output based on three checks: 1) Is there a LaTeX installation available? If yes, PDF. 2) If not, is the computer running Windows? If yes, Word. 3) If both checks fail, HTML. After that is render, which tells R to render whatever output file you selected and save it. If you plan on making changes to the R Markdown file after it is created, you want to set this to FALSE. The .Rmd file will automatically open for you to edit, and will be saved in your working directory.

You can also specify a file name with file=, and/or a volume number (for updated reports) with vol=#. If you've already generated a report and try to create a new one without specifying a new file name, you'll get an error; you can override that and force dataMaid to save over the existing file with replace=TRUE.
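
As a minimal sketch of how those arguments fit together (the file name is just an example, and the other arguments follow the ones described above):

# overwrite a previously generated report that used the same file name
makeDataReport(Facebook, output="html", render=FALSE,
               file="report_Facebook.Rmd", replace=TRUE)
# or tag an updated report with a volume number
makeDataReport(Facebook, output="html", render=FALSE, vol=2)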

Putting some of these arguments together generates a report that includes all variables in the dataset, with default summaries and visualizations.

makeDataReport(Facebook, output="html", render=FALSE, file="report_Facebook.Rmd")

You can take a look at the example report I generated here. In fact, by running this report, I noticed some problems with the simulated dataset. Instead of using R to generate it for me, I used Winsteps (the Rasch program I use). I didn't notice until now that some of the items have values out of range for the rating scale used, such as Rumination item scores greater than 3. Thanks to this report, I identified (and fixed) these problems, then updated the simulated dataset available here. (This is the same link used in previous posts; regardless of where you're clicking from, this will take you to the updated file.) I also noticed that I miscoded gender when creating that factor, and switched Male and Female. So I can go back up to that code and fix that as well.

But a report with every single variable might be more than I actually want. I may want to specify only certain variables, such as those that appear problematic based on the built-in cleaning analysis dataMaid does. I can set that with onlyProblematic=TRUE:

makeDataReport(Facebook, output="html", render=FALSE, file="report_Facebook_sum.Rmd",
               onlyProblematic = TRUE)

You can check out that report here.

I always used render=FALSE so I could make changes to the report before knitting it to an HTML document. Since this opens an R Markdown file, I could add whatever information I wanted, just by clicking to that point in the document and typing. For instance, I could add a note on the reverse function I used to reverse-code variables. I could include code for how I defined the gender factor - just to make certain I don't do it wrong again! You can display code without running it by wrapping it in backticks (`). I generated one last data report, using the cleaned data with the gender factor correctly coded, and added some codebook details. You can take a look at that document here.
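
For example, here's the kind of codebook note I might paste into the generated .Rmd, inside a code chunk with the knitr option eval=FALSE so it is displayed but not run (a minimal sketch, reusing the reverse function from earlier in this post):

# reverse-scored items were computed with this helper function
# (shown for documentation only; eval=FALSE keeps knitr from re-running it)
reverse <- function(max, min, x) {
  (max + min) - x
}
Facebook$Sav2R <- reverse(7, 1, Facebook$Sav2)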

Monday, April 16, 2018

N is for N (Sample Size) Estimation: Power Analysis in R

We're pushing forward in the Blogging A to Z Challenge! Today, I'll talk about conducting power analyses in R with the pwr package. It's amazing how many studies are designed and conducted without the aid of a power analysis. Once I learned how to do them in grad school, I couldn't see any other way to design a study - how else do you know how many people you need? Conducting a study is no guarantee you'll find a significant effect, but if there is one to be found, you want to make sure you're maximizing the chance that you'll find it. Otherwise, you might go to a lot of time and effort to do a study and then have nothing to show for it.

To estimate sample size, you need a few pieces of information, one of which is the effect size you're hoping to detect. Later this month, I'll show how to conduct a meta-analysis in R, which is really useful to do on a small scale to generate this effect size and help inform a power analysis. I'll talk about that in those future posts, and I'm hoping to write an article for a methods journal going into detail about how to do a mini meta-analysis, using the one I did for my dissertation as a working example.

The other pieces of information you need, besides effect size, are:
  • The statistical analysis you plan to use.
  • Power - 1-Beta; convention is 0.8, or an 80% chance that, if there is a significant difference, you'll detect it, but 0.9 is sometimes used.
  • Significance level - what alpha you'll be using for your significance tests; 0.05 is the convention.
The pwr package allows you to conduct power analyses for the following statistical analyses:
  • One-Sample Proportion: comparing a proportion to a population value
  • Two-Sample Proportion: comparing proportions from two samples with either equal or unequal n's
  • One-Sample T-Test: comparing a sample mean to a population mean
  • Paired-Sample T-Test: comparing means at two time points in one sample or from two matched samples
  • Two-Sample T-Test: comparing means from two samples with either equal or unequal n's
  • Chi-Square Test
  • One-Way ANOVA: comparing means from three or more samples
  • General Linear Model Test: including regressions and Factorial ANOVAs
  • Correlation
Each power analysis function requires 3 of the 4 pieces of information: effect size, sample size, significance level, and power. To estimate sample size, leave that piece of information out of the function, or set its value as NULL. But you could also use these functions to conduct a post-hoc power analysis - what was your power given the effect size, significance level, and sample size you used? To do that, you would just leave out power.

First, let's do a very simple power analysis. In last year's Blogging A to Z, I used an ongoing example of a (fictional) study on the impact of caffeine on test performance, and I demonstrated how to conduct a t-test using some data I generated. Descriptive statistics for the two groups showed:

Experimental group: M = 83.2, SD = 6.21
Control group: M = 79.3, SD = 6.40

I can compute an effect size - for two means, the effect size you'd use is Cohen's d.

caffeine<-read.delim("caffeine_study.txt", header=TRUE)
caffeine$group<-factor(caffeine$group, labels=c("Control","Experimental"))
library(effsize)
## Warning: package 'effsize' was built under R version 3.4.4
cohen.d(caffeine$score, caffeine$group, pooled=TRUE, conf.level=0.95)
## 
## Cohen's d
## 
## d estimate: 0.62398 (medium)
## 95 percent confidence interval:
##        inf        sup 
## 0.09471117 1.15324889
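
As a rough sanity check, Cohen's d is just the difference between the group means divided by the pooled standard deviation, so the rounded descriptives above should land close to the same value:

# (M1 - M2) / pooled SD, using the rounded group statistics (equal n's)
(83.2 - 79.3) / sqrt((6.21^2 + 6.40^2) / 2)
# roughly 0.62, close to what effsize computes from the raw data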

My effect size is about 0.624. Let's say you want to replicate my study, but rather than just picking a random number of people to have in each group, you want to determine how many you need to detect an effect of that size. For that, we'll use the pwr package.

library(pwr)
## Warning: package 'pwr' was built under R version 3.4.4
pwr.t.test(d=0.624, sig.level=0.05, power=0.8,
           type = "two.sample", alternative = "two.sided")
## 
##      Two-sample t test power calculation 
## 
##               n = 41.29776
##               d = 0.624
##       sig.level = 0.05
##           power = 0.8
##     alternative = two.sided
## 
## NOTE: n is number in *each* group

To detect an effect of that size, you want about 41 people per group, if you're conducting a two-sided test. If you instead want one-sided, you'd specify whether you expect your comparison group to be "less" or "greater".

pwr.t.test(d=0.624, sig.level=0.05, power=0.8,
           type = "two.sample", alternative = "greater")
## 
##      Two-sample t test power calculation 
## 
##               n = 32.45438
##               d = 0.624
##       sig.level = 0.05
##           power = 0.8
##     alternative = greater
## 
## NOTE: n is number in *each* group

So if you predict ahead of time that the experimental group will have a higher score, and only want to look for a significant difference in that direction, you need about 32 people per group. This tells me that the fake study I designed (and just randomly picked sample sizes for) was slightly underpowered, since I only had 30 people in each group. (Not to mention these are fake data I set up to have a very large effect. Effects as large as d = 0.624 aren't all that common in the wild.)
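
Since these functions solve for whichever piece you leave out, we can check that claim with a quick post-hoc power analysis: supply n and omit power. (A sketch using the values above; the returned power should come out somewhat below the 0.8 target, which is what "slightly underpowered" means here.)

# post-hoc power for the fake study: 30 per group, d = 0.624, one-sided test
pwr.t.test(n=30, d=0.624, sig.level=0.05,
           type = "two.sample", alternative = "greater")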

What if you wanted to do something a bit fancier, like estimate necessary sample size for a range of potential effect sizes? This next example is based on a program I recently wrote for a problem at work. (Values and some details have been changed.) A state board requires school programs to have a pass rate on a key exam of 80%. If a program misses that pass rate one year, they go on probation. If they miss it for 3 consecutive years, they are suspended and can no longer be considered a state board-approved school. The board has called up a statistician (hi) to see if a handful of school programs have a pass rate significantly lower than this 80% cutoff. The problem is, some of these school programs are small, and as a result, their pass rates may vary a lot from year to year, just by luck of the draw.

In addition to sharing a report of the study results, the statistician wants to include a power analysis, to show how many cases are necessary before she can reliably say a pass rate is too low. She wants to show the necessary sample size for a range of school program pass rates (say between 50% and 79%), and for two levels of power, 0.8 (convention) and 0.9. Here's how we could go about conducting that power analysis.

First, we need to generate our effect sizes. This is a one-sample proportion, where we're comparing a range of school pass rates (50-79%) to a population value (80%). In the pwr package, this effect size is called h, and you can compute it with the ES.h function.

p1s <- seq(.5,.79,.01)
h <- ES.h(p1 = p1s, p2 = 0.80)
nh <- length(h)

Next, we'll create a vector of our two power levels:

p <- c(.8,.9)
np <- length(p)

You'll notice I also requested the length of each vector. That becomes important now, when I tell R to combine the two and generate my target sample sizes using a for loop.

sizes <- array(numeric(nh*np), dim=c(nh,np))
for (i in 1:np){
  for (j in 1:nh){
    pow_an <- pwr.p.test(n = NULL, h = h[j],
                         sig.level = .05, power = p[i],
                         alternative = "less")
    sizes[j,i] <- ceiling(pow_an$n)
  }
}

Now I have data in two columns, target sample sizes for each value of h as my rows, with power = 0.8 in one column and power = 0.9 in the other. I'll rename my columns, so I can easily see which is which, and I'll add in the school pass rates I used, so that I can plot my results with that information on the x-axis.

samp <- data.frame(cbind(p1s,sizes))
colnames(samp)<-c("Pass_Rate","Power.8", "Power.9")
summary(samp)
##    Pass_Rate         Power.8           Power.9        
##  Min.   :0.5000   Min.   :   15.0   Min.   :   21.00  
##  1st Qu.:0.5725   1st Qu.:   25.5   1st Qu.:   34.75  
##  Median :0.6450   Median :   51.0   Median :   71.00  
##  Mean   :0.6450   Mean   :  554.2   Mean   :  767.70  
##  3rd Qu.:0.7175   3rd Qu.:  167.2   3rd Qu.:  231.00  
##  Max.   :0.7900   Max.   :10075.0   Max.   :13956.00

For power of 0.8, necessary sample size ranges from 15 to over 10,000. For power of 0.9, the range is 21 to almost 14,000. Now I can start plotting my results from my samp data frame. If I only wanted to show results for one power level (0.8):

library(ggplot2)
passrates<-ggplot(samp, aes(x=p1s, y = Power.8)) +
  geom_point() +
  geom_line()
passrates + scale_y_continuous(breaks=seq(0,11000,1000)) +
  scale_x_continuous(labels = scales::percent) +
  labs(title = "Comparing Pass Rates Below 80%",
       x = "Pass Rate",
       y = "N Needed to Detect Significant Difference") +
       theme(plot.title=element_text(hjust=0.5))


If I wanted to plot both:

bothpowers<-ggplot(samp, aes(p1s)) +
  geom_line(aes(y=Power.8, colour="0.8")) +
  geom_line(aes(y=Power.9, colour="0.9")) +
  scale_y_continuous(breaks=seq(0,14000,1000)) +
  scale_x_continuous(labels = scales::percent) +
  labs(title = "Comparing Pass Rates Below 80%",
       x = "Pass Rate", y = "N Needed to Detect Significant Difference", colour="Power") +
  theme(plot.title=element_text(hjust=0.5))
bothpowers


It doesn't appear that bumping up to 0.9 power makes a huge difference in terms of sample size, except for the smallest comparison - 79% vs. 80%. But either way, the recommendation I would make to the state board is that pass rates should be 70% or less before we can say a program is reliably falling below the board's cutoff. This is especially true for smaller programs. Instead, for programs close to the cutoff, some form of remediation may be a better strategy. If they consistently fall below 70%, then drastic action could be taken.

You can learn more about the pwr package in its manual.

Tuesday, April 3, 2018

C is for Cross-Tabs Analysis

Cross-tabs is short for cross-tabulation (or cross tables); they're used to display frequencies for combinations of categorical (and sometimes ordinal) variables. In a previous blog post, I described chi-square, which is frequently used to analyze cross-tabs. As I said in that post, chi-square is a little like an ANOVA in how it examines the relationship between two variables, but it is used to look at an association between two variables with two (and sometimes more) levels. Basically, you would use a non-parametric test like chi-square when working with variables that aren't continuous or that violate key assumptions of parametric tests.

Today, I'll demonstrate how to generate cross-tabs in R (which I also did in the previous post) and I'll show two ways to analyze cross-tabs: chi-square (again) and a similar test, Fisher's exact test.

Once again, I'll use my Facebook dataset to demonstrate. None of the variables in the set I used previously would really qualify for cross-tabs, since the variables in that dataset are meant to be combined into continuous scales. I included gender, but because of the student body makeup at the school where I collected my data, there are a lot of women and very few men - which isn't optimal for this type of analysis. But I can pull in some additional data collected in the demographics portion of the survey. The goal of the study was to examine Facebook usage patterns, so participants reported whether they used Facebook. The vast majority of the sample did. But they also reported use of other social media, including Twitter and LinkedIn.

A bit more than half the sample were freshmen, since the participant pool is drawn from Introductory Psychology, mostly taken by freshmen. The remaining participants were upper-class or non-traditional students. It makes sense that these two groups might have different usage patterns. That is, older students might be more likely to use LinkedIn, as they prepare for job searches and networking; non-traditional students are likely to already have a job or career. It's unclear, however, whether we might see a similar difference for Twitter users - though keep in mind, these data were collected in 2010, when Twitter may have been a different landscape. So let's generate two cross-tabs, both comparing freshmen to upper-class/non-traditional students (or younger versus older, for simplicity): one to look at Twitter use and one to look at LinkedIn use.

First, I'll read in that data, then redefine the variables I need as factors: age2 (recoded from the continuous age variable) and the indicators for using Twitter and LinkedIn. This gives labels to my cross-tabs. To change a variable, I have to refer to it twice in my code: once before the <- to say which variable I'm changing, and again inside factor() to say which variable to convert. I use the dataset$variablename syntax to refer to a specific variable:

age_socialmedia<-read.delim(file="age_usage.txt", header=TRUE)
age_socialmedia$age2<-factor(age_socialmedia$age2, labels=c("Younger","Older"))
age_socialmedia$Twitter<-factor(age_socialmedia$Twitter, labels=c("Non-User","User"))
age_socialmedia$LinkedIn<-factor(age_socialmedia$LinkedIn, labels=c("Non-User","User"))

If I had wanted, I could have created a new variable in the first part of the code instead: if I write the name of a variable that doesn't exist in the dataset, it gets added. But since I'm not recoding, just adding labels, I have no issue with overwriting the existing variables.
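
If I had gone that route, it would have looked something like this (age_group is a hypothetical column name, just to illustrate adding a variable rather than overwriting one):

# would add a new labeled column and leave the original age2 codes untouched
age_socialmedia$age_group <- factor(age_socialmedia$age2, labels=c("Younger","Older"))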

I can generate my tables with the following:

Twitter<-table(age_socialmedia$age2, age_socialmedia$Twitter)
Twitter
##          
##           Non-User User
##   Younger      111   29
##   Older         91   25
LinkedIn<-table(age_socialmedia$age2, age_socialmedia$LinkedIn)
LinkedIn
##          
##           Non-User User
##   Younger      139    1
##   Older        109    7

Use of either social media site is not very high in this sample, and LinkedIn use is much too low for chi-square - one of the assumptions of that test is that no cells have expected counts less than 5. Fisher's exact test will work in that situation, though, so we can use chi-square for Twitter and Fisher's exact test for LinkedIn.
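
If you want to see the expected counts behind that assumption, chisq.test stores them in its return value; running it on the LinkedIn table also triggers a warning that the chi-squared approximation may be incorrect, which is another hint to reach for Fisher's exact test:

# expected cell counts under independence; both "User" cells fall below 5 here
chisq.test(LinkedIn)$expected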

The code for either is very easy, especially if you named your tables:

chisq.test(Twitter)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  Twitter
## X-squared = 9.2489e-05, df = 1, p-value = 0.9923

The Yates' continuity correction was developed because chi-square tends to be biased toward significance in 2x2 tables when samples (and expected cell counts) are small. We can easily turn this correction off, and most people do:

chisq.test(Twitter, correct=FALSE)
## 
##  Pearson's Chi-squared test
## 
## data:  Twitter
## X-squared = 0.026729, df = 1, p-value = 0.8701

In this case, the correction made little difference - the uncorrected p-value is smaller, but neither is anywhere close to significant. So we'd conclude that, in this sample, there is no age difference in Twitter use.

Now let's conduct our Fisher's exact test, which, as I mentioned above, can be used when cells have expected counts less than 5:

fisher.test(LinkedIn)
## 
##  Fisher's Exact Test for Count Data
## 
## data:  LinkedIn
## p-value = 0.02477
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##    1.112105 404.350276
## sample estimates:
## odds ratio 
##   8.863863

This test is significant; LinkedIn users tend to be older in this sample. In fact, the odds of using LinkedIn are over 8 times higher for older students than for younger students in the present sample.
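
As a quick check on that estimate, the simple cross-product odds ratio from the table above is in the same ballpark (fisher.test reports a conditional maximum likelihood estimate, so the two won't match exactly):

# odds of LinkedIn use for older vs. younger students, straight from the table
(7/109) / (1/139)
# roughly 8.9, close to the 8.86 reported by fisher.test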

Friday, January 12, 2018

How Did I Get Here?: From the Desk of a Psychometrician

Psychometrics refers to an area of statistics focused on developing, validating, and standardizing tests, though it can also refer to a methodological process of creating tests, beginning with identifying the exact construct to be measured, all the way up to continuing revision and standardization of test items. As institutions become more and more data-driven, many are looking to hire psychometricians to serve as their measurement experts. It's an in-demand field, and still a rather small one, so there aren't a lot of people with this training.

For this post, I focused on my own journey – since the only experience I know well is my own – but I plan to have future posts discussing other tracks colleagues (both past and present) have taken. My hope is to give you, the reader, some guidance on entering this field from a variety of starting places.

To tell you more about myself, my journey toward becoming a psychometrician could probably begin as early as 2002-2003, my junior year in college, when I first encountered the field of psychometrics. But it wasn’t until 2014 that I really aimed at becoming a psychometrician, through post-doctoral training. And though I’d been calling myself a psychometrician since at least 2014, I took my first job with the title “psychometrician” in 2016.

Of course, it may have begun even earlier than that, when as a teenager, I was fascinated with the idea of measurement – measures of cognitive ability, personality, psychological diagnoses, and so on. I suppose I was working my way toward becoming a psychometrician my whole life; I just didn’t know it until a few years ago.

But I may ask myself:

Undergrad

I majored in Psychology, earning a BS. My undergrad had a two-semester combined research methods and statistics course with a lab. While I think that really set the foundation for being a strong statistician and methodologist today, I also recognize that it served to weed out many of my classmates. When I started in Research Methods I my sophomore year, I had over 30 classmates – all psychology majors. By Research Methods II, we were probably half that number. And in 2004, I graduated as 1 of 5 psychology majors.

During undergrad, I had two experiences that helped push me toward the field of psychometrics, both during my junior year. I completed an undergraduate research project – my major required either a research track (which I selected), involving an independent project, or a practitioner track that involved an internship. This project gave me some hands-on experience with collecting, analyzing, and writing about data, and is where I first learned about exploratory factor analysis, thanks to the faculty sponsor of my research. And that year, I also took a course in Tests & Measures, where I started learning about the types of measures I would be working on later as a psychometrician.

At this point in my career, I wasn’t sure I wanted to go into research full-time, instead thinking I’d become a clinical psychologist or professor. But I knew I enjoyed collecting and (especially) analyzing data, and this is when I first started spending a lot of time in SPSS.

Spoiler alert: I use SPSS in my current job, but it’s not my favorite and in past jobs, I've spent much more time in R and Stata. But it’s good to get exposed to real statistical software1 and build up a comfort level with it so you can branch out to others. And getting comfortable with statistical software really prepared me for…

Grad School

After undergrad, I started my master's, then PhD, in Social Psychology. I had applied to clinical psychology programs as well, but my research interests and experience (and lack of clinical internship) made me a better fit with social psychology. My goal at that time was to become a professor. Since I had a graduate assistantship, I got started on some research projects and began serving as a teaching assistant for introductory psychology and statistics. This is where I began to fall in love with statistics, and where I first had the opportunity to teach statistics to others. Many students struggled with statistics, and would visit me in office hours. Working to figure out how to fix misunderstandings made me learn the material better, and turning around and teaching it to others also improved my understanding.

I took basically every statistics course I could fit into my schedule, and tacked on a few more through self-designed candidacy exams, workshops at conferences, and self-learning for the fun of it: multivariate statistics, structural equation modeling (including an intermediate/advanced 3-day workshop paid for with monetary gifts from Christmas and birthday), meta-analysis, power analysis, longitudinal analysis, and so on. Surprisingly, I didn't delve into modern test theory, and I can't remember if I'd even heard of item response theory or Rasch at this point - but then, I was only looking at the psychology scene. Those approaches hadn't reached a tipping point in psychology yet, and honestly, I'm not sure they have even now, but we're much closer than we were a decade ago.

I also became more exposed to qualitative research methods, like focus groups and interviews, and how to analyze that kind of data, first through a grant-funded position shortly after I finished my masters degree, and then, during my last year of graduate school, as a research assistant with the Department of Veterans Affairs. (By that point, I’d given up on being a professor and decided research was the way I wanted to go.) And that position as a research assistant led directly to…

Post Doc #1

Post-doc #1 was part of a program to train health services researchers. But I had some freedom in designing it, and devoted my time to beefing up my methods skills, becoming someone who could call herself an experienced qualitative (and mixed methods) researcher. I trained people on how to conduct focus groups. I coordinated data collection and transcription of hundreds of focus groups and interviews over multiple studies. I analyzed a ton of qualitative data and got more comfortable with NVivo. I brought innovative qualitative research methods to VA projects. And I published a lot.

A bit before my last year of post-doc #1, I became involved with a measurement development project using Rasch. I was brought on board for my qualitative expertise: to conduct and analyze focus groups for the first step in developing the measure. I was intrigued. I knew all about classical test theory, with its factor analysis and overall reliability. But this approach was different. It gave item-level data. It converted raw scores, which it considered ordinal (something I’d never thought about), into continuous variables. It could be used to create computer adaptive tests and generate scores for two people that are on the same metric even if the two people had completely different items. It was like magic.

But better than magic – it was a new approach to statistics. And as with statistical approaches I’d encountered before that, I wanted to learn it.

Fortunately, the Rasch expert for the study was a friend (and a colleague who thought highly of me and my abilities), and she took the time to sit down and explain to me how everything worked. She let me shadow her on some of the psychometric work. She referred me to free webinars that introduced the concepts. She showed me output from our study data and had me walk through how to interpret it. And most of all, she encouraged me to learn Rasch.

So I wrote a grant application to receive…

Post Doc #2

That’s right, I did 2. Because I’m either a masochist or bat-s*** crazy.

But this time, I had a clear goal I knew I could achieve: the goal of becoming a full-fledged psychometrician. As such, this post-doc was much more focused than post-doc #1 (and was also shorter in length). I proposed an education plan that included courses in item response theory and Rasch, as well as a variety of workshops (including some held by the Rehabilitation Institute of Chicago and the PROMIS project). I also proposed to work with a couple of experts in psychometrics, and spent time working on research to develop new measures. I already had the qualitative research knowledge to conduct the focus groups and interviews for the early stages of measure development, and I could pull from the previous psychometric study I worked on to get guidance on the best methods to use. In addition to my own projects, I was tapped by a few other researchers to help them with some of their psychometric projects.

Unfortunately, the funding situation in VA was changing. I left VA not long after that post-doc ended, in part because of difficulty with funding, but mainly, I wanted to be a full-time psychometrician, focusing my time and energy on developing and analyzing tests.

Fortunately, I learned about an opportunity as…

Psychometrician at Houghton Mifflin Harcourt

When I first looked at the job ad, I didn't think I was qualified, so I held off on applying. I think I was also unsure about whether a private corporation would be a good fit for me, since I had been in higher education and government for most of my career at that point. But a friend talked me into it. What I learned after the interview is that 1) I am qualified and that self-doubt was just silly (story of my life, man), and 2) the things I didn't know could be learned and my new boss (and the department) was very interested in candidates who could learn and grow.

Fortunately, I had many opportunities to grow at HMH. In addition to conducting analyses I was already very familiar with (Rasch measurement model statistics, item anchoring, reliability calculations, structural equation modeling, and other classical test theory approaches), I learned new ones: classical item analysis, age and grade norming, quantile regression, and anchoring items tested in two languages. I also learned more about adaptive testing.

A few months after I started at HMH, they hired a new CEO, who opted to take the company in a different direction. In the months that followed, many positions were eliminated, including my own. So I got a 2 month vacation, in which I took some courses and kept working on building my statistics skills.

Luckily, a colleague referred me to…

Manager of Exam Development at Dental Assisting National Board

DANB is a non-profit organization that serves as an independent national certification board for dental assistants. We have national exams, as well as certain state exams we oversee, for dental assistants to become certified in areas related to the job tasks they perform: radiation safety (since they take x-rays), infection control, assisting the dentist with general chairside activities, and so on. My job includes content validation studies, which involve subject matter experts and current dental assistants in identifying which topics should be tested, and standard-setting studies, which once again involve subject matter experts, this time to determine pass-points. I just wrapped up my first DANB content validation study and also got some additional Rasch training, this time in Facets.

Going forward, I’m planning on sharing more posts about the kinds of things I’m working on.

For the time being, I’ll say it isn’t strictly necessary to beef up on methods and stats to be a psychometrician – I’m an anomaly in my current organization for the amount of classical stats training I have. And in fact, psychometrics, especially with modern approaches like item response theory and Rasch, is so applied, there are many psychometricians (especially those who come at it from one of those applied fields, like education or healthcare) without a lot of statistics knowledge. It helps (or at least, I think it does) but it isn’t really necessary.


That's what brought me here! Stay tuned for more stories from fellow psychometricians in the future. In the meantime, what questions do you have (for me or other psychometricians)? What would you like to learn or know more about? And if you're also a psychometrician (or in a related field), would you be willing to share your journey?


1Excel, while a great tool for some data management, is not real statistical software. It's not unusual for me to pull a dataset together in Excel, then import that data into the software I'll be using for analysis.

Tuesday, November 7, 2017

Is DeVos On Her Way Out?

I've said a lot about my thoughts on Betsy DeVos before, so I won't repeat all of that. But officials believe she's going to resign:
Thomas Toch, director of independent education think tank FutureEd, told Politico that DeVos was ignorant of the job's constraints when she accepted it and insiders are already preparing for her to vacate the position.

DeVos was roundly criticized for her lack of basic knowledge about education policy during the confirmation hearing process. She blames President Donald Trump's transition team, claiming she was "undercoached."
Do you believe in miracles? Because I might now.

Seriously, though: anyone else find it ironic that an administration that claims it values meritocracy and hates "entitled snowflakes" is full of people who were handed money and power, and constantly blame other people for their own shortcomings and shitty performance?

Friday, October 20, 2017

Reviewer 2 Could Be Anybody

We all dread getting reviewer 2 on our research articles and grant proposals. Now imagine a world where anyone with an ax to grind could be reviewer 2. Rand Paul wants us to live in that world:
Senate Republicans have launched a new attack on peer review by proposing changes to how the U.S. government funds basic research.

New legislation introduced this week by Senator Rand Paul (R–KY) would fundamentally alter how grant proposals are reviewed at every federal agency by adding public members with no expertise in the research being vetted. The bill (S.1973) would eliminate the current in-house watchdog office within the National Science Foundation (NSF) in Alexandria, Virginia, and replace it with an entity that would randomly examine proposals chosen for funding to make sure the research will “deliver value to the taxpayer.” The legislation also calls for all federal grant applications to be made public.

Paul’s proposed solution starts with adding two members who have no vested interest in the proposed research to every federal panel that reviews grant applications. One would be an “expert … in a field unrelated to the research” being proposed, according to the bill. Their presence, Paul explained, would add an independent voice capable of judging which fields are most worthy of funding. The second addition would be a “taxpayer advocate,” someone who Paul says can weigh the value of the research to society.
I mean, we can give him the benefit of the doubt, since we know the Republicans have no idea what happens when they put someone with zero experience into power.</sarcasm>

I've already commented on how incredibly dangerous it is to allow political parties to serve as gatekeepers to education and the media. Now imagine they also serve as gatekeepers of what scientific research is acceptable. The bill also opens the door to grant decisions that are entirely politically motivated, by giving this new office full veto power - that is, with a flick of a pen, it could overrule any decision made by the review panel.

Are you scared yet?

Monday, August 7, 2017

Cats Are A**holes (And Have Been for a While)

Via a Facebook group I belong to, Reviewer 2 Must Be Stopped - this article about the history of cats in academia:
A couple of years ago, Emir Filipović, from the University of Sarajevo, was trawling through the Dubrovnik State Archives when he stumbled upon a medieval Italian manuscript (dated 1445) marked with four very clear paw prints.
It could have been worse: around 1420, one scribe found a page of his hard work ruined by a cat that had urinated on his book. Leaving the rest of the page empty, and adding a picture of a cat (that looks more like a donkey), he wrote the following:

Here is nothing missing, but a cat urinated on this during a certain night. Cursed be the pesty cat that urinated over this book during the night in Deventer and because of it many other cats too. And beware well not to leave open books at night where cats can come.

For us humans though, the most interesting, and pressing, cat research investigates their propensity for spreading mind-controlling parasites. You see, cats carry a parasite called Toxoplasma gondii, which alters the behaviour of animals to make them less afraid of predators (and therefore more likely to be killed, eaten and used as a conduit for further propagation of the parasite).

The slightly eccentric Czech scientist Jaroslav Flegr has made such research his life’s work, and since a “light bulb” moment in the early 1990s he has been investigating the link.

We’ve known for years that infection is a danger during pregnancy and a major threat to people with weakened immunity, however the research of Flegrs and others suggests that infected humans are more likely to be involved in car crashes caused by dangerous driving and are more susceptible to schizophrenia and depression. And all that is to say nothing of the as-yet-unexplained link between cat bites and depression.
It's not all anti-cat. There's a great story about a cat who was named coauthor on a paper in physics, and as a result, got many invitations to speak at conferences and even an academic appointment.