Wednesday, January 31, 2018

Statistical Sins: Types of Statistical Relationships

On this blog, I've covered a lot of statistical topics, and have tried to make them approachable even to people with little knowledge of statistics. I admit that, recently, I've been covering more advanced topics. But there are still many basic topics to explore that could be helpful for non-statistical readers, as well as for teachers of statistics who address these topics in their courses. This topic was prompted by a couple of discussions - one late last week, and another at dance class last night.

The discussions dealt with what it means to say two variables are related to each other (or not), including whether that relationship is weak, moderate, or strong, and whether it is positive or negative. I first addressed this topic when I wrote about correlation for my 2017 April A to Z. But let's really dive into that topic.

You may remember that correlation ranges from -1 to +1. The closer the value is to 1 (positive or negative), the stronger the relationship between the two variables. So in terms of strength, it is the absolute value that matters. A correlation close to 0 indicates no relationship between the variables.

A common question I see is whether a correlation is weak, moderate, or strong. Different people use different conventions, and there isn't a lot of agreement on this topic, but I frequently see a correlation of 0.5 referred to as strong. As I explained in my post on explained variance, this means the two variables share about 25% variance, or more specifically, about 25% of the variation in one variable can be explained by the other variable.

For moderate, I often see 0.3 (or 9% shared variance) and for weak, 0.1 (or 1% shared variance). But as I said, these really aren't established conventions (please don't @ me - thanks in advance). You could argue that there are a variety of factors that influence whether a relationship is seen as weak, moderate, or strong. For instance, study methods could have an impact. Finding a correlation of 0.5 between two variables I can directly manipulate and/or measure in an experimental setting is completely different from finding a correlation of 0.5 between two variables simply measured in a natural setting, where I have little to no control over confounds and high potential for measurement error. And we are often more generous about what we consider strong or weak in new areas of research than in well-established topics. But conventions give you a nice starting point, that you can then shift as needed depending on these other factors.
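If you want to check those shared-variance numbers, just square the correlation. A quick line of R confirms it:

r <- c(strong = 0.5, moderate = 0.3, weak = 0.1)
r^2  # shared variance: 0.25, 0.09, 0.01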

Direction of the relationship is indicated by the sign. A positive correlation means a positive relationship - as scores on one variable go up, so do scores on the other variable. For example, calories consumed per day and weight would be positively correlated.

A negative correlation means a negative relationship - as scores on one variable go up, scores on the other variable go down. For instance, minutes of exercise per day and weight would probably have a negative correlation.

[Now, for that last relationship, you might point out that there are a variety of other factors that could change that relationship. For example, some exercises burn fat, lowering weight, while others build muscle, which might increase weight. I hope to explore this topic a bit more later: how ignoring subgroups in your data can lead you to draw the wrong conclusions.]

Part of the confusion someone had in our discussion last night was knowing the difference between no relationship and a negative relationship. That is, they talked about how one variable (early success) had no bearing on another variable (future performance). They quickly pointed out that this doesn't mean having early success is bad - the relationship isn't negative. But I think there is a tendency for people unfamiliar with statistics to confuse "non-predictive" with "bad".

So let's demonstrate some of these different relationships. To do that, I've generated a dataset in R, using the following code to force the dataset to have specific correlations. (You can use the code to recreate it on your own.) The first thing I do is create a correlation matrix, which shows the correlations between each pairing of variables, that reflect a variety of relationship strengths and directions. Then, I impose that correlation matrix onto a randomly generated dataset. Basically, R generates data that produce a correlation matrix very similar to the one I defined.

# Target correlation matrix, defined one row/column at a time
R1 = c(1,0.6,-0.5,0.31,0.11)
R2 = c(0.6,1,-0.39,-0.25,0.05)
R3 = c(-0.5,-0.39,1,-0.001,-0.09)
R4 = c(0.31,-0.25,-0.001,1,0.01)
R5 = c(0.11,0.05,-0.09,0.01,1)
R = cbind(R1,R2,R3,R4,R5)

# Cholesky decomposition of the target matrix, transposed to lower triangular
U = t(chol(R))
nvars = dim(U)[1]
numobs = 1000

# Generate uncorrelated standard normal data, then multiply by U to impose
# (approximately) the target correlation structure
set.seed(36)
random.normal = matrix(rnorm(nvars*numobs,0,1),nrow = nvars, ncol=numobs)
X = U %*% random.normal
newX = t(X)
raw = as.data.frame(newX)
names(raw) = c("V1","V2","V3","V4","V5")
cor(raw)


The final command, cor, which requests a correlation matrix for the dataset, produces the following:

           V1          V2         V3          V4          V5
V1  1.0000000  0.57311834 -0.4629099  0.31939003  0.10371136
V2  0.5731183  1.00000000 -0.3474012 -0.26425660  0.04838563
V3 -0.4629099 -0.34740123  1.0000000  0.01204920 -0.12017036
V4  0.3193900 -0.26425660  0.0120492  1.00000000  0.01202121
V5  0.1037114  0.04838563 -0.1201704  0.01202121  1.00000000

Just looking down the columns, you can see that the correlations are very close to what I specified. The one exception is the correlation between V3 and V4 - I asked for -0.001 and instead got 0.012. That isn't R fudging anything; it's just sampling variability. With 1,000 randomly generated observations, the observed correlations will drift a little from the targets, and a drift of about 0.01 is simply most noticeable when the target is essentially zero.

So now, I can plot these different relationships in scatterplots, to let you see what weak, moderate, and strong relationships look like, and how direction changes the appearance. Let's start by looking at our weak correlations, positive and negative. (Code below, including lines to add a title and center it.)

library(ggplot2)

weakpos <- ggplot(raw, aes(x = V1, y = V5)) + geom_point() +
  labs(title = "Weak Positive") +
  theme(plot.title = element_text(hjust = 0.5))
weakpos  # display the plot

weakneg <- ggplot(raw, aes(x = V3, y = V5)) + geom_point() +
  labs(title = "Weak Negative") +
  theme(plot.title = element_text(hjust = 0.5))
weakneg

That code produces these two plots:


With similar code (just switching out variables in the x= and y= part), I can produce my moderate plots:


and my strong plots:
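Here's one way that similar code could look, if you'd like to recreate the remaining plots yourself. Based on the correlation matrix above, I'm pairing V1 with V4 and V2 with V4 for the moderate plots, and V1 with V2 and V1 with V3 for the strong plots (the object names are just suggestions):

modpos <- ggplot(raw, aes(x = V1, y = V4)) + geom_point() +
  labs(title = "Moderate Positive") +
  theme(plot.title = element_text(hjust = 0.5))

modneg <- ggplot(raw, aes(x = V2, y = V4)) + geom_point() +
  labs(title = "Moderate Negative") +
  theme(plot.title = element_text(hjust = 0.5))

strongpos <- ggplot(raw, aes(x = V1, y = V2)) + geom_point() +
  labs(title = "Strong Positive") +
  theme(plot.title = element_text(hjust = 0.5))

strongneg <- ggplot(raw, aes(x = V1, y = V3)) + geom_point() +
  labs(title = "Strong Negative") +
  theme(plot.title = element_text(hjust = 0.5))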

As you can see, even a strong relationship looks a bit like a cloud of dots, but you can see trends that go from almost nonexistent to more clearly positive or negative. You can make the trends a bit easier to spot by adding a trendline. For example:

strongpos+geom_smooth(method=lm)


The flatter the line, the weaker the relationship. Two variables that are unrelated to each other (such as V4 and V5) will have a roughly horizontal line through the scatter:
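Following the same pattern as the code above, one way to produce that plot would be (again, the object name is just a suggestion):

norel <- ggplot(raw, aes(x = V4, y = V5)) + geom_point() +
  geom_smooth(method = "lm") +
  labs(title = "No Relationship") +
  theme(plot.title = element_text(hjust = 0.5))
norel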


I'll come back to this topic Sunday (and in the future post idea I mentioned above), so stay tuned!

Tuesday, January 30, 2018

Interesting Analysis on Reasons for Congressional Resignations - And the Impact of #MeToo

According to an interesting article from FiveThirtyEight, more people are resigning from Congress recently than we've seen at any other point in the last 117 years:
The 115th Congress owes its historic turnover to the confluence of two events, one normal and one abnormal. First, there’s the start of a new presidential administration. Five of the first six members to resign this session did so to accept jobs in President Trump’s administration. That’s not unusual. It’s similar to the seven members who resigned in 2009 to join the Obama administration and the five members who left in 1993 to join Bill Clinton’s.

But in addition, three of the four most recent members to resign from the 115th Congress did so because they were accused of unwanted sexual advances: John Conyers, Trent Franks and Al Franken.
The article features this graph, which shows number of resignations and reasons over time, annotated with major events occurring at that time:

Sunday, January 28, 2018

Statistics Sunday: Quantile Regression (Part 1)

About a month ago, I introduced the concept of quantiles - ways of dividing up the distribution of scores into certain percentile groups. The median is one of the most well-known quantiles - it is the score that divides the distribution of scores in half, so that half of the scores in the distribution are below it and half are above. Unlike mean and standard deviation, which are most useful when scores fall along a normal distribution, quantiles can be used to describe just about any distribution (although visualization of your data distribution is still very important).

Regression is a useful technique when you want to predict an outcome based on one or more variables thought to influence that outcome. The most well-known type of regression is linear regression, where the relationship between two variables can be plotted on a scatterplot with a straight line. So a linear relationship is one important assumption when using linear regression. You wouldn't want to use it if you believe the relationship between your two variables follows a curve.

But linear regression has other assumptions, one of which is that the variance of scores is consistent across the line. For instance, take a look at this scatterplot with a prediction line drawn through it:


This is the scatterplot I used in my scatterplot post, which was created using data from my study of Facebook use. Most of the dots don't fall directly on the line. And that's to be expected, because even strong predictors can't predict an outcome perfectly. When we say the relationship is linear, we are referring to the trend. But scores are going to vary and differ from our predicted score - sometimes because of other variables we didn't measure, and sometimes because of idiosyncrasies in the data. But to use linear regression, we have to meet an assumption called "homoscedasticity." That means that the variance of scores at one point on the line should be equal to the variance of scores at other points along the line. You would want the scatter of the scatterplot to be fairly consistent across the chart. If you have a plot that looks more like this:


you're likely violating an assumption. The scores are more spread out in the upper-right corner than they are in the lower-left corner.

When I was a psychometrician at Houghton Mifflin Harcourt, we regularly worked with data that had a curved relationship between variables. That is, rather than following a straight line, the best way to describe the relationship between two variables was with a curved line - specifically, a growth curve, where we see rapid increases between two correlated variables, followed by a flattening out. There is a type of regression - polynomial regression - that allows us to model curvilinear relationships. But there was another interesting observation in these growth curves: the variance in scores at the low end was much greater than at the high end. Standard regression wouldn't do. We needed a regression approach that could handle heteroscedasticity.

Enter quantile regression. This approach makes no assumptions about how scores are distributed around the regression line, because it actually gives you multiple regression lines - one for each quantile. That is, you pick whichever quantiles of the outcome variable you need. You could use quartiles, for instance: the values that divide the distribution into evenly sized quarters. The analysis then produces one regression equation for each of those quantiles, predicting that quantile of the outcome from your predictors. This approach is useful for situations in which the relationship between variables is not quite linear and/or the variance is not quite consistent.
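Here's a minimal sketch of what this can look like in R with the quantreg package, using made-up heteroscedastic data (the variable names and numbers are placeholders, not data from an actual study):

library(quantreg)

# Simulated data where the spread of y grows as x increases
set.seed(36)
x <- runif(500, 0, 10)
y <- 2 + 0.8 * x + rnorm(500, sd = 0.5 + 0.3 * x)
dat <- data.frame(x, y)

# One regression line each for the 25th, 50th, and 75th percentiles of y
fit <- rq(y ~ x, tau = c(0.25, 0.5, 0.75), data = dat)
summary(fit)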

Next week, I'll provide a real example of quantile regression, and show how to conduct this analysis in R, using the quantreg package. Stay tuned!

Friday, January 26, 2018

Psychology for Writers: On Fight, Flight, and Fear Responses

This will be a new series on my blog, coming out on Fridays. (I may not post one every single week, but they'll always come out on a Friday.) My goal is to impart some wisdom from my discipline - psychology - to help writers, in part because I see a lot of tropes over and over that are not really in line with what we know about human behavior. (And I have so many writer friends with their own expertise - I'd love to see others share some domain knowledge like this to help us write more accurately!)

One of the big tropes I see is how people respond to extreme fear, usually by soiling themselves in some way. George R.R. Martin just loves to have his characters pee and crap themselves, but he's certainly not the only offender. In fact, I just finished Sleeping Beauties by Stephen King (one of my favorite authors) and Owen King, and was really frustrated when once again, the peeing-oneself-in-fear trope reared its ugly head.

Do people pee themselves in fear? Yes they do. It probably doesn't happen anywhere near as often as it does in books, but then we don't usually encounter some of literature's most fear-inducing creatures in real life. So we'll say for the moment that the frequency with which people pee themselves is fine. In fact, that's not even the issue I have.

The issue I have is when this response occurs. In the books (and movies & TV series too), something scary shows up and the character immediately wets him or herself. Nope. That's not when it happens. It would happen after the fear moment has passed - after the creature has been beaten or moved onto another target, just as the character is beginning to breathe a sigh of relief. That's right - you pee after you've been afraid, not during.

How could that be? Let me talk to you for a moment about the fight vs. flight response, for which I've created this visual aid:

Regardless of whether you're going into battle or running away, your body's response is the same: in response to some stressor, your body is raring to go. Specifically, the sympathetic nervous system is getting you ready to expend energy, because fighting or flighting, you're going to be expending energy. Your body releases chemicals, like epinephrine (adrenaline), norepinephrine, and cortisol. Your heart rate and blood pressure increase. Your pupils dilate. Your muscles get ready for action.

But as these systems are charging up, other systems are slowing or shutting down completely. Your immune response is suppressed. Sexual arousal is suppressed. Digestion slows or may even stop. And your body stops producing waste materials, like urine. It's not very helpful if, just as you're getting ready to fight for your life, you realize you're hungry or need to pee, or if your allergies act up and you've suddenly got itchy eyes, sneezes, and a runny nose. Those feelings go away, temporarily, so you can focus your energy on surviving.

But after the stressful period ends, your parasympathetic nervous system takes over. Unlike the sympathetic nervous system's "fight or flight" response, the parasympathetic nervous system's response is "rest and digest." (I've also heard "feed and breed," because of the aforementioned impact on sexual arousal.) Those processes that were slowed or stopped start up again. Your muscles relax. You can imagine where this is going.

The best portrayal of this concept I've ever seen was in The Green Mile: the scene where Wild Bill, one of the inmates on death row, grabs Percy, one of the guards, and threatens him. The other guards get Wild Bill to release Percy, who rushes to the other side of the corridor and collapses on the floor. Wild Bill says with a smirk that he wouldn't have hurt Percy. And that's when Del, a fellow inmate, points out that Percy has just wet himself.

Percy is away from Wild Bill and safe. While the emotions he was experiencing when Wild Bill grabbed him are still present (he's still upset), his body is returning to its previous physiological state. And that's when his bladder lets go.

So there you have it - if you really want your character to wet him- or herself, it should happen after the character begins "coming down."

I know one question I've gotten when discussing this in the past is, why do animals pee on someone/something in fear? Remember, animals don't have the same highly evolved nervous system that we have. Their systems are not going to shut down in the same way a human's does. Also, their reaction is not directly to fear, but is (arguably) an evolved response that is pretty effective in getting predators to let them go. Sure, an animal could bite instead, but that's aggressive and likely to earn an aggressive response. These animals are generally built for flight, not fight, and peeing is a submissive response. But humans and other creatures at the top of the food chain are apex predators, so our responses look quite different from those of prey animals.

Bonus: you could use this information to portray the opposite. Sometimes authors want to show how badass their character is by having them be completely unfazed by something another character finds terrifying. You could have your character walking into battle commenting that he/she is starving and hopes to get something to eat after finishing with this nonsense, or something along those lines.

I know I blog about statistics a lot on this blog, but I'm always thrilled to talk about my discipline too! Any psychology topics you'd like to see addressed here? (Or any tropes you're tired of seeing?)

Wednesday, January 24, 2018

Data Science and Trump's Tweets

A while back, I shared a post by David Robinson of Variance Explained, where he conducted a sentiment analysis on Trump's tweets. He was able to find differences in tweet content and sentiment, and showed that you could tell which tweets Trump wrote and which ones were written by someone else.

Recently, David Robinson was featured on the Data Crunch podcast, talking about his work. You can listen through the embedded audio below, or head over to this page to find out more.

Statistical Sins: Facebook's Search for Trustworthy News Sources

You've heard the story already: fake news - actual fake news, and not just what Trump calls fake news - has been propagating on social media networks. I've encountered so much in my networks, I've begun using Snopes as a verb. E.g., "Can you Snopes this, please?" In fact, fake news may have had real-world consequences, perhaps even influencing the results of elections.

Fake news has been able to propagate, not simply because of people who spread what they knew to be fake, but because many (likely well-meaning) people bought it and shared it.

Which is why Facebook's response to this issue is so ridiculous:
Last week, Facebook said its News Feed would prioritize links from publications its users deemed "trustworthy" in an upcoming survey. Turns out that survey isn't a particularly lengthy or nuanced one. In fact, it's just two questions.

Here is Facebook's survey — in its entirety:

Do you recognize the following websites

  • Yes
  • No

How much do you trust each of these domains?

  • Entirely
  • A lot
  • Somewhat
  • Barely
  • Not at all

A Facebook spokesperson confirmed this as the only version of the survey in use. They also confirmed that the questions were prepared by the company itself and not by an outside party.
That's right, Facebook intends to protect people from fake news by asking the very people who helped spread that news what sources they find trustworthy. Do you see the problem with this scenario? Because the leadership at Facebook certainly doesn't.

Yesterday evening, I went to my first meeting of an advisory board for an applied psychology accelerated bachelors program for adult learners. During that meeting, we were asked what skills and knowledge would be essential for someone coming out of such a program. One of my responses was that a skillset from my training I've had to use in every job, and in many of my volunteer experiences, has to do with creating and fielding surveys. There is an art and a science to surveying people, and there are ways to write questions that will get useful data - and ways that will get you garbage. Facebook's survey is going to give them garbage.

Even if you forget about the countless people who, every day, mistake well-known satirical news sites (like the Onion) for genuine ones, not every site is clear on whether it purports to be real news or simply entertainment - and let's be honest, where do you draw that line between informing and entertaining? How do you define something as trustworthy or not? And how might variation in how people define that term influence your data? Many years ago, when Jon Stewart was still on The Daily Show, I remember a commercial in which they shared that more Americans get their news from The Daily Show than anywhere else, to which Stewart replied, "Don't do that! We make stuff up!" Even though they were forthcoming about this, people still considered them trustworthy.

The real issue is when people can't tell the difference. So now you're fixing a problem caused by people being unable to tell the difference by asking people to tell the difference. At best, the survey will produce such inconsistent data, it won't have any influence on what links can and can't be shared. At worst, the same biases that caused fake news to be shared to begin with will be used to deem sites trustworthy or not. And having the Facebook stamp of trustworthy could result in even more harm.

Honestly, information campaigns to make people more skeptical would be a much better use of Facebook's time and resources.

Monday, January 22, 2018

Hey You Guys/Y'all/Youse Guys!

This morning, a member of one of my Facebook writing groups shared a dialect quiz. A series of 25 questions asks your preferred term for things like addressing a group of people, discussing the evening meal, or enjoying a sugar-sweetened carbonated beverage. Your responses generate a heat map, showing how similar your dialect is to different parts of the country:
The data for the quiz and maps shown here come from over 350,000 survey responses collected from August to October 2013 by Josh Katz, a graphics editor for the New York Times who developed this quiz and has since written “Speaking American,” a visual exploration of American regional dialects.

Most of the questions used in this quiz are based on those in the Harvard Dialect Survey, a linguistics project begun in 2002 by Bert Vaux and Scott Golder. The original questions and results for that survey can be found on Dr. Vaux’s current website.
I was raised in Kansas City, but have lived in Chicago for 14 years. I figured my dialect had changed in my time away, but it turns out you can take the girl out of Kansas City, but you can't take Kansas City out of the girl:

Sunday, January 21, 2018

Statistics Sunday: Violin Plots

Last week, I described box plots, a staple of introductory statistics courses. But Paul Hanel, a post-doc at the University of Bath, was kind enough to share with me a better way of visualizing data than a box plot: a violin plot.

I'd honestly never heard of a violin plot before, but it combines the best qualities of two forms of data display: the box plot and the probability density plot (the shape of the distribution; see this post on the histogram).

To demonstrate the violin plot, I'll be using R and the ggplot2 package. ggplot2 and other members of the so-called tidyverse in R, like dplyr, are built to work with tidy objects. A tidy dataframe, for instance, is one in which each column is a variable and each row is a case. ggplot2 gives you a lot of control over the appearance of your graphics and is really the best way to go in creating publication quality graphics. Sometimes I get lazy and just use base R graphics, like when I simply want to quickly visualize data so I can move on to the next step. (Yes, yes, I know ggplot2 also has some qplot (quickplot) options for that very purpose.)

So let's demonstrate the violin plot using my own reading data - this is a dataset of the 53 books I read in 2017. In this tidy dataset, each case is a book, and each book has multiple variables addressing things like page length, genre, and author gender. I'll read my data in and also make sure some of the key variables are converted to factors - this will become important when I start creating my violin plots.

books<-read.csv("2017_books.csv", header = TRUE)
books$Fiction<-factor(books$Fiction,labels=c("Nonfiction", "Fiction"))
books$Fantasy<-factor(books$Fantasy, labels=c("Non-Fantasy", "Fantasy"))


A couple of bar plots show me that I was almost evenly split on fiction v. nonfiction (I read 30 fiction and 23 nonfiction) and that I read 19 fantasy books (approximately two-thirds of my fiction books).

library(ggplot2)
fiction <- ggplot(books, aes(Fiction)) + geom_bar()
fiction + labs(title="Type of Books Read in 2017") +
  ylab("Number of Books") +
  xlab("Type of Book")



fantasy <- ggplot(books, aes(Fantasy)) + geom_bar()
fantasy + labs(title="Fantasy Books Read in 2017") +
  ylab("Number of Books") +
  xlab("Genre")



I could use either of these variables to better understand my sample of books. For instance, I gave fantasy books higher ratings and also gave longer books higher ratings. But how might genre and page length relate to each other? I could potentially visualize number of pages for fantasy and nonfantasy books.

Rather than just typing out the syntax (as I did above), I'll walk through it. First up, always name your objects in R, whatever they may be - data, graphics, etc. In the case of the graphics above, it makes it easier to add on some formatting, and since ggplot2 can be used to do even more interesting things with formatting, your code can get rather long. Mostly, it's a good habit to get into. As Vanessa Ives says in Penny Dreadful (which I've just started watching), "you have to name a thing before it comes to life."

Next, we're using the ggplot syntax to create these graphics. As I mention above, there are also qplots, which have fewer options. qplots are good for quick visuals to check in on the data. ggplot gives you a lot more options and is good for more complex plots and creating publication quality images. In parentheses, we name the dataset we'll be using, books, and in the nested function, aes, which is short for aesthetics, we dictate the variables we're using in the display and how to use a variable to color-code results. Unlike the barplots above, we'll be using two variables in our display, so we need to define them both. Finally, + geom_ tells R which type of visualization we're using. In this case, we'll be using + geom_violin. As you can see, you can embed further aes information within the geom_ as well.

fantlength <- ggplot(books, aes(x = Fantasy, y = Pages)) + geom_violin(trim = FALSE, aes(fill = Fantasy))
fantlength  # display the plot

I'm not finished formatting my graphic yet, but here's our first look at the violin plot.


The typical fantasy book seems to be slightly longer than nonfantasy books, but not by much, and nonfantasy books have a wider range. Of course, I should note that right now, these violin plots are really just sideways density plots. To complete them, I need to layer a boxplot on top of them. Since I named my object, I can just add that on.

fantlength+geom_boxplot(width=.3)


The median for book length is similar but slightly higher for Fantasy books. They have similar interquartile ranges, but the nonfantasy books have a wider overall range and more outliers. It's not a very surprising conclusion that the nonfantasy books are a much more heterogeneous group, including both fiction and nonfiction books across a wide variety of genres.

Really digging into ggplot2 would take more than one post - in fact multiple - but I should at least show at this point how to do a few things you'd need if you wanted to present this violin chart somewhere. That is, you'd want to add better axis titles and an overall title. Overall title, by default, is left-aligned, so I also need to add a command to center it. Plus, the legend is superfluous since you have the x-axis labels. So adding those things in to the original violin plot would look like this:

fantlength+geom_boxplot(width=0.3)+
  labs(title="Page Length by Genre")+
  ylab("Number of Pages")+
  xlab("Genre")+
  theme(legend.position="none",plot.title=element_text(hjust=0.5))

Note you need to add the boxplot command again, since you're referencing the original plot, fantlength. But as you can see, you can add multiple lines with the + sign. Those commands give you this:

Many of these commands are standard and could be used with any kind of ggplot. More on that later!

Still taking requests on stats topics to cover. Let me know in the comments below!

I'm also working on cleaning up my labels and generating multiple labels for different statistics topics. This should make it easier to navigate the various statistics posts on my blog. Stay tuned for that.

Friday, January 19, 2018

The Only Rule is That There Are No Rules

Next week, I'll be participating once again in NYC Midnight Short Story Challenge. I made it to round 2 last year, but really struggled with my 2nd heat assignment - a fantasy story involving scuba diving and a treasured object. I fell into the trap of, "I'm writing fantasy, and fantasy looks like X." I wrote something that I really had no business writing - high fantasy with magical creatures. That's just not my genre. I'll read the hell out of those books. I don't write them.

I was thinking about that heat once again, and of course just now came up with the perfect idea that would have totally been in my wheelhouse. I remember thinking about how much I love whimsical things and wondering why I don't write with more whimsy. I thought about my awful attempt at writing fantasy... and came up with the perfect, whimsical story basically a year too late. One that takes a completely benign and relatable modern situation and adds fantastical elements, whimsy, and comedy to the mix. That's okay, I'm still going to write it. And since it's not for the short story contest, I can toss out those elements from the heat I don't really need.

And on that subject, this wonderful blog post popped up in one of the writer groups I belong to - don't be afraid to change the rules of the genre you're writing. Question yourself when you think things have to be a certain way, such as when I thought fantasy needed to have fairies and elves. Even the situation of seeking a treasured object can be twisted around into something else. Words have many meanings, and some concepts, like treasure, truly are in the eye of the beholder.

When writing, always question. Never take anything at face value. Make it your own, whatever that looks like.

Thursday, January 18, 2018

Statistical Sins: Data Dictionaries and Variable Naming Conventions

Before I started at DANB, the group fielded a large survey involving thousands of participants from throughout the dental profession. My job the last few weeks has been to dig through this enormous dataset, testing some planned hypotheses.

Because the preferred statistical program with my coworkers is SPSS, the data were given to me in an SPSS file. The nice thing about this is that one can easily add descriptive text for each variable, predefine missing values, and label factors. But this can also be a drawback, when the descriptive text is far too long and used to make up for nonintuitive variable names. As is the case with this dataset.

That is, in this dataset, the descriptive text is simply the full item text from the survey, copied and pasted, making for some messy output. Even worse, when the data were pulled into SPSS, each variable was named Q followed by a number. Unfortunately, there are many variables in here that don't align to questions, but they were still named in order. This makes the Q-numbering scheme meaningless. Responses for question 3 in the survey are in the variable, Q5, for instance. Unless you're using descriptive variable names (e.g., data from the question about gender is called "gender"), numbering schemes become unwieldy unless they can be linked to something, such as item number on a survey. It's tempting to skip the step of naming each variable when working with extremely large datasets, but it's when datasets are large that intuitive naming conventions are even more necessary.

I'm on a tight schedule - hence this rushed blog post - so I need to push forward with analysis. (I'm wondering if that's what happened with the last person, too, which would explain the haphazard nature of the dataset.) But I'm seriously considering stopping analysis so I can pull together a clear data dictionary with variable name and shorter descriptive text, in sample order rather than overall survey order. There are also a bunch of new items the previous analyst generated that don't look all that useful for me and make the dataset even more difficult to work with. At the very least, I'm probably going to pull together an abbreviated dataset that removes these vestigial variables.
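For anyone facing a similar cleanup, here's a rough sketch of how you could pull variable names and their descriptive text out of an SPSS file into the start of a data dictionary using the haven package (the file name and object names below are placeholders, not the actual survey):

library(haven)

svy <- read_sav("survey.sav")  # placeholder file name

# haven stores each variable's descriptive text in a "label" attribute
dictionary <- data.frame(
  variable = names(svy),
  label = vapply(svy, function(x) {
    lab <- attr(x, "label")
    if (is.null(lab)) NA_character_ else lab
  }, character(1)),
  stringsAsFactors = FALSE
)

write.csv(dictionary, "data_dictionary.csv", row.names = FALSE)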

Tuesday, January 16, 2018

On the Debate Over Consent and #MeToo

When the #MeToo movement first appeared, and many of my friends and acquaintances were using it to share their stories, I hesitated. Not because I didn't have my own story to tell, but because I wasn't completely ready to tell the world I was a survivor of sexual assault and also because I feared it would turn into a competition. I didn't think it would purposefully be a competition, where people respond to a story with challenges that their own story is so much worse. But I feared that would be implied in some of the responses. I know my story is pretty bad, but obviously, there are others that are much worse. That's not important. What happened to me damaged me - I thought for a long time that it had damaged me beyond repair. I didn't want to feel like I had to justify why my experience was painful, or to explain why I still have nightmares about something that happened so long ago. Something I spent time in therapy and self-defense classes dealing with to help me move on.

But sadly, it's happening. And though my heart goes out to the women who are having to justify why an experience was painful, it's resulting in a debate that we absolutely need - a debate about what consent really involves, and the behaviors that, while not criminal, need to stop.

By now, you've probably read about the allegations against Aziz Ansari and "Grace" (not her real name), the woman who went on a date with Aziz that ended with an upsetting sexual encounter that left her crying in the back of a cab. This story is generating many responses, including an article from The Atlantic that left me very frustrated.

In that Atlantic article, Caitlin Flanagan didn't seem to understand what "Grace" was so upset about. She seems to recognize that things have improved since her own sexual awakening:
I was a teenager in the late 1970s, long past the great awakening (sexual intercourse began in 1963, which was plenty of time for me), but as far away from Girl Power as World War I was from the Tet Offensive. The great girl-shaping institutions, significantly the magazines and advice books and novels that I devoured, were decades away from being handed over to actual girls and young women to write and edit, and they were still filled with the cautionary advice and moralistic codes of the ’50s.
But she falls back into problematic logic and victim blaming:
Those magazines didn’t prepare teenage girls for sports or STEM or huge careers; the kind of world-conquering, taking-numbers strength that is the common language of the most-middle-of-the road cultural products aimed at today’s girls was totally absent. But in one essential aspect they reminded us that we were strong in a way that so many modern girls are weak. They told us over and over again that if a man tried to push you into anything you didn’t want, even just a kiss, you told him flat out you weren’t doing it. If he kept going, you got away from him. You were always to have “mad money” with you: cab fare in case he got “fresh” and then refused to drive you home. They told you to slap him if you had to; they told you to get out of the car and start wailing if you had to. They told you to do whatever it took to stop him from using your body in any way you didn’t want, and under no circumstances to go down without a fight. In so many ways, compared with today’s young women, we were weak; we were being prepared for being wives and mothers, not occupants of the C-Suite. But as far as getting away from a man who was trying to pressure us into sex we didn’t want, we were strong.
Is Caitlin right that "Grace" could have just left? Pushed Aziz away? Slapped him and told him to stop getting fresh? Yes, she's right. "Grace" could have done all those things. But this story, and the various responses I've read, all seem to read this particular instance in black and white: either "Grace" wanted to have sex with Aziz, in which case she should shut up about calling this misconduct, or she didn't want to have sex with him, and she should have gotten out of there sooner. But that narrative ignores the fact that not only do both parties need to consent to sex, both parties need to consent to the way it is taking place.

"Grace" admits she made the first move, of sorts. She met Aziz at a party, and though he brushed her off at first, she tried to make a connection with him that eventually worked. She gave him her number and they went out on a date. Yes, she was interested in him. Yes, she might have wanted to have sex with him. In fact, that's probably why she didn't leave or slap him. She wanted to have sex with him, but not in the way he was going about it. So she told him to slow down. She pulled away when he shoved his fingers in her mouth. That strikes me as someone who wants to be with him, but not in that way. How many women have been with men where we've had to say, "Hey, slow down"? How many of us have dealt with things getting a little rougher than we want - hair pulling that becomes a bit too hard, what should have been a playful smack that ends up leaving a handprint - but didn't stop because we feared hurting his feelings? It isn't that we don't want to be with that person. We want to stay with them; we just don't want the specific things they keep doing.

The story is spun like her only choices are do what he wants or get the hell out of there. And eventually she did, probably when she realized that trying to gently, and then more forcefully, tell him that she didn't like the specific things he was doing wasn't working. That was the point when she had to decide that the man she was very attracted to, that she had really wanted to go out with, was going to keep using her as an object to act out his porn-inspired fantasies. And by leaving in that way, it meant she was closing the door on ever having a relationship with him. It had to get pretty bad for her to get to that point.

He didn't commit any crimes. He shouldn't go to jail, or lose job opportunities, or be labeled as a sex offender. But this case highlights something we need to discuss: that consent isn't as simple as yes or no.

I'm glad "Grace" shared her story, and I'm sorry that, despite her efforts, her interaction with Aziz was so painful. Rather than dismissing it as "not so bad" and saying it cheapens the stories of "real victims," why can't we use this case to help understand this issue of consent? To really think why our initial reaction to "Grace" is dismissal or frustration? If nothing else, Aziz shows us that, even the most progressive and "woke" members of either gender can have some difficulty navigating and understanding consent. Let's use this instance as a way to explore that, and move on as better people.

On that note, I highly recommend you check out this post, which does a great job at discussing the problem of dismissing as "not so bad."

Book Review: Every Heart a Doorway

You may have noticed I've been posting more book reviews recently. In addition to analyzing data and writing, reading is one of the main ways I spend my time, down- or otherwise. This year, I've challenged myself once again to read 48 books. I'm already 4 books in and it's only January 16. I've also had some pesky insomnia recently, so reading is one of the things I do when I can't sleep. But I've mostly been reading short books so far, so I know that rate will decrease once I get to some of the much longer books on my list.

Which brings us to the book I just finished last night, Every Heart a Doorway by Seanan McGuire. It's another instance of young adult fantasy, which, as I discovered in my first Goodreads data analysis, is the only kind of young adult book I read.


The book takes place at a Home for Wayward Children. While the headmistress, Eleanor West, tells the family of these children that she will help heal their children of their delusions about other magical worlds, in actual fact, her school is a place for children who have stepped into a magic world of some kind and then have to deal with the heartbreak of being expelled from it for one reason or another. That's right, these are the children that ended up in places like Narnia or Oz (though only Narnia is brought up in the book, and dismissed as fiction), after stepping or being pulled through some kind of magical doorway.

Each child is broken in some way and longs to return to their magical world, or at least, longs to deal with the pain of being thrown out. The story starts when Nancy is sent to the school by her parents, after having disappeared into the Land of the Dead and returned much later with a strange tale about where she had been. She is paired with the bubbly Sumi, who spent her time in a High Nonsense world, as her roommate. There, Nancy learns that the magical worlds use a different kind of compass, one where the major coordinates are Nonsense, Logic, Virtue, and Wickedness, in place of North, South, East, and West, and that each child returns from their world changed. She hopes to simply fit in at the school, but then one of her classmates dies. As a former resident of the Land of the Dead, some of the suspicion falls on Nancy. She and her newfound friends try to find out what is happening and who is responsible.

The book (a novella) is a quick read and beautifully written. Seanan's prose is very poetic and thought-provoking at times. As she gets farther into the story, and begins the murder mystery angle, her prose becomes a bit more descriptive and less poetic. I'm not terribly surprised that Every Heart a Doorway has won so many awards. But I have to say, I was a bit disappointed when she made that shift to more descriptive writing, because I was really enjoying the beauty and wordplay.

The story of children being taken to magical worlds suited to their underlying personality traits is an apt metaphor for the difficulty "fitting in" experienced by so many children and teenagers. But Seanan also directly takes on topics teenagers (and adults) struggle with, like sexuality and gender identity.

I'll definitely be checking out Seanan's other books, including the two other Wayward Children books.

Monday, January 15, 2018

Book Review: Fun Home

Over the weekend, I read Fun Home by Alison Bechdel. This graphic novel is Alison's memoir, describing her upbringing in rural Pennsylvania, her family but particularly her relationship with her closeted father, Bruce, and her own coming out.


The story is nonlinear, with each chapter exploring different themes while often covering large spans of time. But much of the focus is on the fact that (and this is stated very early in the book, so it's not completely a spoiler) Alison's father died after stepping in front of a truck (which she concludes was purposeful) not long after Alison came out to her family, learned her father had had affairs with men (and boys), and introduced the family to her girlfriend, Joan.

Alison feels responsible for her father's death, that he killed himself not because his daughter is a lesbian, but because she was able to do something he never had the courage to do: come out. Instead, he focuses his energy on restoring his Victorian home, reading and discussing literature, and taking his frustrations and anger that he isn't living the life he wants out on his family. The name of the story, Fun Home, not only references the nickname given to the family business (a funeral home) but also ironically refers to the atmosphere Bruce created ruling over the house like a tyrant.

The story was adapted into a musical, which I saw last year and absolutely loved. At the performance, I bought a copy of the book, which has been languishing on one of my to-read shelves for too long. It's a quick read, so if you're looking for a great story you can finish over a weekend, I highly recommend it. Because I'd seen the musical first, I couldn't help but compare to the original and decide what I thought of the adaptation. The play really does justice to the graphic novel, hitting most of the high points while maintaining Alison's story that shifts seamlessly back and forth between comedy and tragedy.

But the book really goes beyond the play in drawing multiple parallels with the literature Bruce loved: the story of the Minotaur and the labyrinth, which Alison compares to the labyrinthine home Bruce creates, to hide himself within; James Joyce's Ulysses; Oscar Wilde's The Importance of Being Earnest; and the writings of F. Scott Fitzgerald, Henry James, Albert Camus (particularly A Happy Death and The Myth of Sisyphus), William Shakespeare, and Marcel Proust. She also makes various pop culture references, such as It's a Wonderful Life. In fact, the way she could weave together tragedy and comedy has many parallels to Shakespearean plays, where the death of the king (in this case, Bruce the tyrant) is followed by a comedic scene. There were so many wonderful metaphors and themes that couldn't really be easily adapted into the play. Also, the play completely leaves out Alison's struggles with obsessive-compulsive disorder, which explains some of her quirks and behavioral patterns that were in the play.

Though the play doesn't really develop many of the characters beyond Alison and Bruce, the book gives a little more attention and development to the other characters (but still not much, relatively speaking). I'd love to hear Alison explore her relationships with her mother and siblings, though it would make more sense to do that in different work(s). And to be fair, she may have already done this; I just need to explore her work more. Prior to learning about Fun Home, my awareness of Alison Bechdel was in the form of the Bechdel test (which I've blogged about here, and in film reviews here and here).

Overall, a quick read that will make you think, laugh, and cry. As someone who has experienced a family suicide, some of those themes were particularly difficult for me, so keep that in mind if that's also a potential trigger for you.

Sunday, January 14, 2018

Statistics Sunday: Boxplots

A few Sundays ago, I wrote a post explaining quantiles. Though you can define those quantiles however you want, a common way to describe data is with quartiles. We've already talked about one of those quartiles - the 50th percentile or the median, which is the value that divides your distribution in half. To get the remaining two quartiles, 25th and 75th percentiles, you would just divide each half of the distribution in half again.

For instance, I had R randomly generate 10 values:

25.5, 17.7, 21.5, 22.5, 21.6, 19.6, 26.0, 19.6, 28.1, 19.7

If I wanted to find the quartiles of these values, the first thing I'd need to do is put them in order:

17.7, 19.6, 19.6, 19.7, 21.5, 21.6, 22.5, 25.5, 26.0, 28.1

The median is the middle value. Since there are 10 values, the middle is actually in between two numbers. I would take the average of the 5th and 6th number to find the median: (21.5 + 21.6)/2 = 21.55 (or rounded off, 21.6). Then we would find the other two quartiles by taking the median of the two halves. The middle value of the first half, the third value, is 19.6. The middle value of the second half, the eighth value, is 25.5. These three quartiles are bound by the minimum value, 17.7 and the maximum value, 28.1.

So this gives me information I can use to summarize my distribution for you.

Min: 17.7
Q1: 19.6
Q2: 21.6
Q3: 25.5
Max: 28.1
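Base R will also hand you this five-number summary directly. The fivenum function uses Tukey's hinges, which match the median-of-halves approach above (quantile's default interpolation method can give slightly different quartiles):

x <- c(17.7, 19.6, 19.6, 19.7, 21.5, 21.6, 22.5, 25.5, 26.0, 28.1)
fivenum(x)  # 17.70 19.60 21.55 25.50 28.10 - min, Q1, median, Q3, max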

But I could instead present these results visually, using a boxplot, also known as a box and whiskers plot. The boxplot, developed by mathematician John Tukey (who I plan on profiling at some point in the near future), presents the 5 values above. The box contains the 3 quartiles - one side of the box is Q1, the other side is Q3, and a bold line in the middle represents Q2. Two whiskers give the minimum and maximum values. Boxplots are incredibly easy to generate in R.

Let's create the 10 randomly generated numbers from above. I set the seed so you can create the same data on your own:

set.seed(42)
quartiles<- data.frame(score=rnorm(10,20,4))
quartiles

     score
1 25.48383
2 17.74121
3 21.45251
4 22.53145
5 21.61707
6 19.57550
7 26.04609
8 19.62136
9 28.07369
10 19.74914


R has a built-in boxplot function:

boxplot(quartiles$score)

This plot displays all of the summary information from above. I would probably change the axis so the plot is a bit more understandable.

boxplot(quartiles$score, ylim=c(15,30))

Which gives me this:

One modification of the boxplot is to truncate the whiskers. This is really useful when you have a skewed distribution. The page length data from my Goodreads dataset (data on 40 readers and 692 books, which I pulled together and described in yesterday's post) is one example of a skewed distribution.

hist(friends1$Pages, main="Book Length", xlab="Number of Pages", ylim=c(0,300))

The distribution shows a positive skew: we have outliers in the right tail. A boxplot based on these data would have a long second whisker. A modified boxplot, on the other hand, uses the interquartile range (Q3 - Q1) to determine maximum whisker length (which is usually the interquartile range multiplied by some constant, such as 1.5). Those extreme values would still be displayed, but as separate marks on the plot, rather than part of the boxplot itself. This lets you easily see that these extreme values are outliers and that most of the distribution is at the lower end of page length.

boxplot(Goodreads$Pages)

From the boxplot, we can see that there are a few outliers on the low end as well. Most of the books fall between 100 and 600 pages, with some very short and very long outliers.
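Incidentally, this truncated-whisker version is what R's boxplot function gives you by default - the range argument controls the whisker rule. With a skewed variable like page length, you can see the difference by comparing the default to range = 0 (I'm reusing the object name from the code above):

# Default: whiskers extend at most 1.5 * IQR beyond the box; anything
# beyond that is drawn as an individual outlier point
boxplot(Goodreads$Pages, range = 1.5)

# range = 0 extends the whiskers all the way to the minimum and maximum
boxplot(Goodreads$Pages, range = 0)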

Saturday, January 13, 2018

More Reading Data Analysis - This Time with Friends

Last week, I wrote a blog post in which I analyzed my reading habits from 2017. I had so much fun pulling that data together and playing with it that I took things a step farther: I decided to look at my friends' 2017 reading habits as well.

I included all friends on Goodreads who logged at least one book as read in 2017. This gives me data on how many books (and which ones) were read by friends who use Goodreads to log their reading activity. I did not include friends who logged 0 books, because there’s no way of knowing if they 1) did not read at all in 2017 or 2) did not log books they read or logged books without adding a read date. This resulted in a dataset of 40 friends and a total of 692 books.

Other things I should note about the data:
  • The dataset isn’t as complete as the one I analyzed for myself; this one includes book title, author, page length, and indicators of which reader(s) logged that book. I didn’t include start/read dates, genres, or rating data. (I originally thought about including ratings, but there was surprisingly little overlap among my friends in books read, so that limited analysis options. I may still pull in genre, though.)
  • Goodreads only gave me first author in my data pull. There are definitely books in the dataset that have multiple authors, but for the sake of simplicity, all author analyses were performed on first author only. Once again, I can pull in these data later if they end up being useful.
  • When I looked at page counts for readers, I noticed a few very long books, so I examined these books to make sure they were not box sets logged as single books. Most were simply very long books, but two instances were in fact multiple books; one case was a 5000+ page book that was actually a 22-book eBook compilation. For these two cases, I updated book read counts and page numbers to reflect the number of actual books, resulting in a different number of books read in the dataset than would be displayed for the person on Goodreads. But this was important when I started doing analysis on page lengths – my histograms and box plots were shrunk on one side to make up for extreme outliers that were not actually reflective of real book length.
  • To track individual readers, I used reader initials, which I then converted into a numeric code to protect reader identity. Should anyone express an interest in playing with this dataset, I’d be able to share it with no identifying information included.
  • A few friends logged audiobooks, which have strange page counts. (For instance, a 1.5 hour audiobook came in at 10 pages! 1.5 hours isn't a long book, but it's certainly longer than 10 pages.) If I could find a print copy of the audiobook either on Goodreads or Amazon, I used that page count. But that left 5 books without real page counts. Information I found online suggested audiobooks are approximately 9300 words per hour, and that a printed book has about 300 words per page. So I used the following conversion: (audiobook length in hours * 9300)/300. This is a gross approximation, but since it only affected 5 books out of 692, I'm okay with it. (I've sketched this conversion as a small R function just after this list.)
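Here's that conversion as a quick helper function (the function name is my own, and the word counts are the rough approximations mentioned above):

# Approximate page count for an audiobook, assuming ~9300 words per hour
# of audio and ~300 words per printed page
audiobook_pages <- function(hours) {
  round(hours * 9300 / 300)
}

audiobook_pages(1.5)  # ~46 pages, rather than the 10 pages logged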
Findings

The 9 most popular books in my dataset
Number of books read by a single reader in 2017 ranged from 1 to 190, with an average of 18.5 books. But the mean isn’t a good indicator here. As you can see in the plot below, this is a highly skewed distribution. Almost 28% (n=11) of my friends logged 1 book in 2017 (and this is the mode of the distribution); only 10% (n=4) read more than 50 books, and all but 1 person read fewer than 100 books. The median was 7 books.


The barplot is easier to read without this outlier:


For the most part, each reader was unique in the books he or she read: 94.5% of books (654) were unique to a single reader, and about 4.2% (29 books) were read by 2 readers in the dataset. That left 9 books (1.3%) read by between 3 and 6 people, which I display in the graphic above. As I mentioned in that previous reading post, the most popular book was The Handmaid’s Tale, read by 6 people. The next was A Man Called Ove, read by 4 people, followed by each of these, read by 3 people: Dark Matter by Blake Crouch, Harry Potter and the Half-Blood Prince and Harry Potter and the Deathly Hallows, both by J.K. Rowling, Into the Water by Paula Hawkins (which won Best Mystery & Thriller in the Goodreads Awards), Thirteen Reasons Why by Jay Asher, Turtles All the Way Down by John Green (which ranked #20 in Amazon's Top 100 list), and Wonder by R.J. Palacio.

True, my dataset probably won’t generalize beyond my friend group, but the popular books match up really well with Amazon’s This Year in Books analysis, which showed The Handmaid’s Tale was the most read fiction book.

The second most popular book on Amazon’s list, It, was read by 2 people in my dataset. Oh, and speaking of Stephen King, he was the most popular author in my data, contributing 13 books read by 8 readers.


The second most popular was Neil Gaiman, with 11 books across 5 readers.


And in fact, going back to my previously noted flaw - that I only analyzed first author - both of these popular authors had 1 book with a coauthor. Sleeping Beauties (winner of Best Horror in the Goodreads awards), which is included in Stephen King's graphic above because he's first author, was cowritten with his son, Owen. And Neil Gaiman should have 1 additional book in his graphic: Good Omens: The Nice and Accurate Prophecies of Agnes Nutter, Witch. That book was cowritten with Terry Pratchett, who was first author and thus the only one who got "credit" for the book in my dataset. The addition of that book would increase Neil's contribution to 12 books, but would have no effect on number of unique readers, or his rank in terms of popularity.

But I should note that these two were most popular based on number of books + number of readers. If I only went off number of books in the dataset, they would just break the top 5. Based on sheer number of books, Erin Hunter was most popular with 22 books and Victoria Thompson was second with 19 books. Lee Child came in third with 14 books.

The cool thing about those particular results? They came from individual readers. One person read those 22 Erin Hunter books, a different person read the 19 Victoria Thompson books, and a third person read the 14 Lee Child books. (In total, these 3 friends read 345, or 49.9%, of the books included in the dataset.) It was fun to see the different fandoms among my Goodreads friends.
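For the curious, here's a sketch of how those author tallies could be pulled together with dplyr, assuming the data frame also has a first_author column (again, a placeholder name of mine); the sort order is just one way of weighing both counts, not the exact rule I used.

```r
library(dplyr)

# Books and distinct readers per first author
author_summary <- books %>%
  group_by(first_author) %>%
  summarise(n_books   = n_distinct(title),
            n_readers = n_distinct(reader)) %>%
  arrange(desc(n_readers), desc(n_books))

head(author_summary, 5)
```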

I'll present some more work from this dataset tomorrow, for Statistics Sunday, when I'll be demonstrating the boxplot. So stay tuned for more results from this dataset!

Friday, January 12, 2018

Book Review: The Ghost: A Cultural History

Through pictures, poems, ballads and stories we are able to watch the British ghost as it changes from violent revenant to suffering soul, from admonitory corpse to demonic illusion, from moral teacher to visual spectacle, from historical ambassador to translucent spectre, from fearsome presence to reassuring historical spirit. In whatever ways ghosts choose to manifest themselves in reality, it is in the imagination that they have the most varied and revealing existence. It is here that ghosts hold up a mirror to us, one that reveals our desires and fears.
The Ghost: A Cultural History by Susan Owens, p. 269

I just finished tearing through The Ghost. (I blogged about this book shortly after I first learned about it, so it's great to follow up with a review.) Though Owens first introduces us to the subject of ghosts with one of the most famous examples in literature - Marley's Ghost in A Christmas Carol - she quickly takes us back to the year 1014, with a story of an attack on (and the death of) the Danish conqueror Sweyn Forkbeard by the East Anglian king St. Edmund, who had died in 869. From there, Owens takes us through the history of ghosts: how they are portrayed and thought of in both fictional and nonfictional stories.

In the past, ghosts doled out punishment to wrong-doers, serving as a sort of karmic balance. In some regards, that isn't too different from our conception of ghosts in popular culture - people who died through some injustice or with important business unfinished. But the way ghosts could manifest themselves changed drastically over the centuries. Prior to the Protestant Reformation, ghosts were often souls that had escaped (temporarily) from purgatory and were asking anyone they encountered to pray for them so they could be released. The Reformation changed that by doing away with the idea that souls went to purgatory rather than directly to heaven or hell. Early ghosts were also basically zombies - they were corporeal and solid, and could be (and sometimes were) mistaken for the living.

During this time, the attire of ghosts also changed, from death shrouds and billowy white fabric (sheets, anyone?) to the clothing they were seen wearing when they were alive.

Ghosts were also inextricably linked to technology. They went from being solid and corporeal to ethereal and translucent around the time entertainers figured out how to put on light shows that projected reflected images of actors - images that were, of course, ethereal and translucent. Painting also influenced this conception, with the switch from oil paints to watercolors.

As technology advanced, giving us telegraphs and electricity, ghosts also changed form. Authors imagined telegrams and phone calls from ghosts, and today's ghost hunters insist that ghosts give off electromagnetic fields that can be picked up with EM monitors. They look for ghostly messages in television static and EVP (electronic voice phenomena). Over and over, the book challenged things I had always accepted as standard folklore. For instance, the tendency of horror movies to punish people who disobey social norms comes from the influence of Victorian ghost stories - an era when stories were cheap and easy to obtain, and thus very popular, and whose supernatural tales were revisited beginning in the 1950s.

But at the heart of the book is why we find ghosts so fascinating and enduring. Though our beliefs in other supernatural beings - vampires, evil witches who sold their souls to the devil, demonic possession as an explanation for physical illness - have thankfully changed over the centuries, one could argue that ghosts are as much an enduring belief today as they were then. I don't want to give away too much, since I highly recommend you check out this book for yourself, but the big takeaway I got is that ghosts endure as a belief because the past is always there - not just in our memories but as a physical presence, in the form of old houses, ancient structures, and dark forests. Every person who has ever lived is still on this earth in some form. And people leave a mark on the places they've been and the people they interact with.

My one complaint is that the book was a bit slow at the beginning, or rather, I was slow to get into it when I first started. It felt as though she jumped from her own narrative to quotes far too often, and I wanted to hear more about her own thoughts and analysis. But once I got about 10-25% of the way through, I found I had fallen into a nice rhythm and had come to appreciate her blending of literary quotes with analysis and commentary. So if you also find it to be a slow read at first, try to push through and I think you'll enjoy it as much as I did.

How Did I Get Here?: From the Desk of a Psychometrician

Psychometrics refers to an area of statistics focused on developing, validating, and standardizing tests, though it can also refer to the methodological process of creating tests, beginning with identifying the exact construct to be measured, all the way up to the ongoing revision and standardization of test items. As institutions become more and more data-driven, many are looking to hire psychometricians to serve as their measurement experts. It's an in-demand field, and still a rather small one, so there aren't a lot of people with this training.

For this post, I focused on my own journey – since the only experience I know well is my own – but I plan to have future posts discussing other tracks colleagues (both past and present) have taken. My hope is to give you, the reader, some guidance on entering this field from a variety of starting places.

To tell you more about myself: my journey toward becoming a psychometrician probably began as early as 2002-2003, my junior year in college, when I first encountered the field of psychometrics. But it wasn't until 2014 that I really set my sights on becoming a psychometrician, through post-doctoral training. And though I'd been calling myself a psychometrician since at least 2014, I took my first job with the title "psychometrician" in 2016.

Of course, it may have begun even earlier than that, when as a teenager, I was fascinated with the idea of measurement – measures of cognitive ability, personality, psychological diagnoses, and so on. I suppose I was working my way toward becoming a psychometrician my whole life; I just didn’t know it until a few years ago.

But I may ask myself: well, how did I get here?

Undergrad

I majored in Psychology, earning a BS. My undergrad program had a two-semester combined research methods and statistics course with a lab. While I think that really set the foundation for being a strong statistician and methodologist today, I also recognize that it served to weed out many of my classmates. When I started Research Methods I in my sophomore year, I had over 30 classmates – all psychology majors. By Research Methods II, we were probably half that number. And in 2004, I graduated as 1 of 5 psychology majors.

During undergrad, I had two experiences that helped push me toward the field of psychometrics, both during my junior year. I completed an undergraduate research project – my major required either a research track (which I selected), involving an independent project, or a practitioner track that involved an internship. This project gave me some hands-on experience with collecting, analyzing, and writing about data, and is where I first learned about exploratory factor analysis, thanks to the faculty sponsor of my research. And that year, I also took a course in Tests & Measures, where I started learning about the types of measures I would be working on later as a psychometrician.

At this point in my career, I wasn’t sure I wanted to go into research full-time, instead thinking I’d become a clinical psychologist or professor. But I knew I enjoyed collecting and (especially) analyzing data, and this is when I first started spending a lot of time in SPSS.

Spoiler alert: I use SPSS in my current job, but it's not my favorite; in past jobs, I've spent much more time in R and Stata. But it's good to get exposed to real statistical software[1] and build up a comfort level with it, so you can branch out to others. And getting comfortable with statistical software really prepared me for…

Grad School

After undergrad, I started my master's and then PhD in Social Psychology. I had applied to clinical psychology programs as well, but my research interests and experience (and lack of a clinical internship) made me a better fit with social psychology. My goal at that time was to become a professor. Since I had a graduate assistantship, I got started on some research projects and began serving as a teaching assistant for introductory psychology and statistics. This is where I began to fall in love with statistics, and where I first had the opportunity to teach statistics to others. Many students struggled with statistics and would visit me in office hours. Figuring out how to untangle their misunderstandings made me learn the material better, and turning around and teaching it to others improved my understanding even more.

I took basically every statistics course I could fit into my schedule, and tacked on more through self-designed candidacy exams, workshops at conferences, and self-learning for the fun of it: multivariate statistics, structural equation modeling (including an intermediate/advanced 3-day workshop paid for with monetary gifts from Christmas and my birthday), meta-analysis, power analysis, longitudinal analysis, and so on. Surprisingly, I didn't delve into modern test theory, and I can't remember if I'd even heard of item response theory or Rasch at this point, but I was only looking around the psychology scene. Those approaches hadn't reached a tipping point in psychology yet - honestly, I'm not sure they've reached it even now, but we're much closer than we were a decade ago.

I also became more exposed to qualitative research methods, like focus groups and interviews, and how to analyze that kind of data - first through a grant-funded position shortly after I finished my master's degree, and then, during my last year of graduate school, as a research assistant with the Department of Veterans Affairs. (By that point, I'd given up on being a professor and decided research was the way I wanted to go.) And that position as a research assistant led directly to…

Post Doc #1

Post-doc #1 was part of a program to train health services researchers. But I had some freedom in designing it, and devoted my time to beefing up my methods skills, becoming someone who could call herself an experienced qualitative (and mixed methods) researcher. I trained people on how to conduct focus groups. I coordinated data collection and transcription of hundreds of focus groups and interviews over multiple studies. I analyzed a ton of qualitative data and got more comfortable with NVivo. I brought innovative qualitative research methods to VA projects. And I published a lot.

A bit before my last year of post-doc #1, I became involved with a measurement development project using Rasch. I was brought on board for my qualitative expertise: to conduct and analyze focus groups for the first step in developing the measure. I was intrigued. I knew all about classical test theory, with its factor analysis and overall reliability. But this approach was different. It gave item-level data. It converted raw scores, which it treats as ordinal (something I'd never thought about), into measures on a continuous scale. It could be used to create computer adaptive tests and to generate scores for two people on the same metric even if those two people had answered completely different items. It was like magic.

But better than magic – it was a new approach to statistics. And as with statistical approaches I’d encountered before that, I wanted to learn it.
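If you're curious what that looks like in practice, here's a tiny sketch using the eRm package in R, fit to the package's built-in example data. This is just my own illustration of a dichotomous Rasch model - not the software or the data from that study.

```r
library(eRm)

# Fit a dichotomous Rasch model to eRm's built-in example data
# (raschdat1: persons in rows, 0/1 item responses in columns)
fit <- RM(raschdat1)
summary(fit)                   # item difficulty estimates, on the logit scale

# Person ability estimates, placed on the same logit metric as the items
pp <- person.parameter(fit)
summary(pp)

# Item-level fit statistics -- the kind of item-level detail that hooked me
itemfit(pp)
```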

Fortunately, the Rasch expert for the study was a friend (and a colleague who thought highly of me and my abilities), and she took the time to sit down and explain to me how everything worked. She let me shadow her on some of the psychometric work. She referred me to free webinars that introduced the concepts. She showed me output from our study data and had me walk through how to interpret it. And most of all, she encouraged me to learn Rasch.

So I wrote a grant application to receive…

Post Doc #2

That’s right, I did 2. Because I’m either a masochist or bat-s*** crazy.

But this time, I had a clear goal I knew I could achieve: the goal of becoming a full-fledged psychometrician. As such, this post-doc was much more focused than post-doc #1 (and was also shorter in length). I proposed an education plan that included courses in item response theory and Rasch, as well as a variety of workshops (including some held by the Rehabilitation Institute of Chicago and the PROMIS project). I also proposed to work with a couple of experts in psychometrics, and spent time working on research to develop new measures. I already had the qualitative research knowledge to conduct the focus groups and interviews for the early stages of measure development, and I could pull from the previous psychometric study I worked on to get guidance on the best methods to use. In addition to my own projects, I was tapped by a few other researchers to help them with some of their psychometric projects.

Unfortunately, the funding situation in VA was changing. I left VA not long after that post-doc ended, in part because of difficulty with funding, but mainly because I wanted to be a full-time psychometrician, focusing my time and energy on developing and analyzing tests.

Fortunately, I learned about an opportunity as…

Psychometrician at Houghton Mifflin Harcourt

When I first looked at the job ad, I didn't think I was qualified, so I held off on applying. I think I was also unsure about whether a private corporation would be a good fit for someone who had been in higher education and government for most of her career up to that point. But a friend talked me into it. What I learned after the interview is that 1) I am qualified, and that self-doubt was just silly (story of my life, man), and 2) the things I didn't know could be learned, and my new boss (and the department) was very interested in candidates who could learn and grow.

Fortunately, I had many opportunities to grow at HMH. In addition to conducting analyses I was very familiar with – Rasch measurement model statistics, item anchoring, reliability calculations, and structural equation modeling and other classical test theory approaches – I also learned new ones: classical item analysis, age and grade norming, quantile regression, and anchoring items tested in 2 languages. I also learned more about adaptive testing.

A few months after I started at HMH, they hired a new CEO, who opted to take the company in a different direction. In the months that followed, many positions were eliminated, including my own. So I got a 2-month vacation, during which I took some courses and kept building my statistics skills.

Luckily, a colleague referred me to…

Manager of Exam Development at Dental Assisting National Board

DANB is a non-profit organization that serves as an independent national certification board for dental assistants. We have national exams, as well as certain state exams we oversee, for dental assistants to become certified in areas related to the job tasks they perform: radiation safety (since they take x-rays), infection control, assisting the dentist with general chairside activities, and so on. My job includes content validation studies, which bring in subject matter experts and current dental assistants to help identify which topics should be tested, and standard setting studies, which once again involve subject matter experts, this time to determine pass-points. I just wrapped up my first DANB content validation study and also got some additional Rasch training, this time in Facets.

Going forward, I’m planning on sharing more posts about the kinds of things I’m working on.

For the time being, I'll say it isn't strictly necessary to beef up on methods and stats to be a psychometrician – I'm an anomaly in my current organization for the amount of classical stats training I have. In fact, psychometrics, especially with modern approaches like item response theory and Rasch, is so applied that there are many psychometricians (especially those who come to it from an applied field like education or healthcare) without a lot of statistics training. It helps (or at least, I think it does), but it isn't really necessary.


That's what brought me here! Stay tuned for more stories from fellow psychometricians in the future. In the meantime, what questions do you have (for me or other psychometricians)? What would you like to learn or know more about? And if you're also a psychometrician (or in a related field), would you be willing to share your journey?


[1] Excel, while a great tool for some data management, is not real statistical software. It's not unusual for me to pull a dataset together in Excel, then import that data into the software I'll be using for analysis.
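For instance (with hypothetical file, sheet, and object names), that hand-off might look like:

```r
# Dataset assembled and cleaned in Excel, then imported into R for analysis
library(readxl)
books <- read_excel("books_2017.xlsx", sheet = "logged_books")

# Or, if the spreadsheet was exported from Excel as a CSV:
books <- read.csv("books_2017.csv", stringsAsFactors = FALSE)
```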