Wednesday, August 9, 2017

Statistical Sins: Women in Tech (Here It Goes Again)

Two days ago, a Google employee was fired for writing a memo explaining why the gender disparity in tech was nothing to worry about and we should all just go about our business where the men fix the stuff and women fix the people.

Man, this sounds familiar. Could it... oh, hell, this nonsense again. And in fact, Dr. Lee Jussim, social psychology's stereotype accepter, makes an appearance in this memo, for his work stating that stereotypes are created because they're true.

The memo reads like a college student's persuasive paper that was written in an energy drink-fueled binge, in the middle of the night when the library was closed, and that's his excuse for the fact that his only search strategy was Google and Wikipedia, even though he would have done that anyway. But, hey, he gave us a TL;DR section. Wasn't that nice of him?
  • Google’s political bias has equated the freedom from offense with psychological safety, but shaming into silence is the antithesis of psychological safety.
  • This silencing has created an ideological echo chamber where some ideas are too sacred to be honestly discussed.
  • The lack of discussion fosters the most extreme and authoritarian elements of this ideology.
    • Extreme: all disparities in representation are due to oppression
    • Authoritarian: we should discriminate to correct for this oppression
  • Differences in distributions of traits between men and women may in part explain why we don't have 50% representation of women in tech and leadership.
  • Discrimination to reach equal representation is unfair, divisive, and bad for business.
Let me explain. No, there is too much. Let me sum up. Dude bro is upset because his dude bro friends in tech can't take all the jobs at Google because of these diversity programs that try to get more women and minorities into tech. Dude bro is mad because these diversity programs include mentorship programs and club meetings that are for women and minorities trying to get into tech and he doesn't like being excluded from things. He thinks sex differences are universal across cultures (they're not), often have clear biological causes (rarely do they have clear anything causes), and are highly heritable (again, they're not). Oh, and he repeatedly conflates personality differences with differences in occupational interests. Yes, personality relates to these things, but moderately at best.

His arguments are convoluted and, at times, contradictory. For instance, he claims diversity programs over-emphasize empathy, and that over-dependence on empathy causes us to favor individuals similar to ourselves. Wait, so doesn't that mean empathy-driven programs would lead the male-dominated tech world to favor other men? Hmm. He also says that while we're working to fix the gender disparity in tech, we would never feel the need to fix the gender disparity in prisons, homelessness, and school dropouts. (Oh, come on.)

And he uses overlaid normal curves to make some point, but I'm at a loss to figure out what that point is.

The main scientific (i.e., not Wikipedia) source he cites is an article by Schmitt, Realo, Voracek, and Allik (2008) (full text here), which is a rather impressive large-scale survey of the Big Five personality traits in 55 countries. Okay, I'll bite, but as I said before, the relationship between personality and occupational interests is weak to moderate. They expressed their results - the difference between women and men - in Cohen's d, which, as you may recall, is a standardized mean difference: the mean of women minus the mean of men, divided by the standard deviation. In this case, the result is positive if women have a higher mean and negative if men have a higher mean.
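For anyone who wants the definition written out, here is the formula as described above. The symbols are my own labels, and the post only says "the standard deviation," so exactly which SD the authors used (pooled or otherwise) is not something I'm asserting here:

```latex
d = \frac{\bar{X}_{\text{women}} - \bar{X}_{\text{men}}}{SD}
\qquad\Longrightarrow\qquad
\bar{X}_{\text{women}} - \bar{X}_{\text{men}} = d \times SD
```

The second form is the useful one for this post: multiply d by the SD and you get the gap back in raw scale points.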

Overall, they found the following Cohen's ds:
  • Neuroticism: 0.40
  • Extraversion: 0.10
  • Openness to Experience: -0.05
  • Agreeableness: 0.15
  • Conscientiousness: 0.12
They don't give standard deviations for each measure, but they do give an overall mean SD of 8.99. Let's just round that up to 9 because it doesn't make much difference. Multiplying each d by that SD, they essentially found that average scores for these 5 scales differ by, respectively, approximately 4 points, a little less than 1 point, less than 1 point, slightly more than 1 point, and slightly more than 1 point. Sure, these differences are statistically significant, but are they practically significant? These differences could have been created by different answers to a single question, a question that may have had biased wording. The study is impressive in that every participant was given the exact same measure (though, of course, it was translated to the native language), but that means that a biased question existed in all 55 samples. Had this been a meta-analysis across different cultures, we wouldn't have that methodological concern.

If we only look at the United States data, the results aren't much different (and before you say, "Hey, universal across cultures," keep in mind that participants from the US made up 16% of the sample and the US sample was over twice as large as the next largest sample):
  • Neuroticism: 0.53
  • Extraversion: 0.15
  • Openness to Experience: -0.22
  • Agreeableness: 0.19
  • Conscientiousness: 0.20
The average SD is 8.49, so to make this painfully clear, these 5 scales differ by, respectively, 4.5 points, 1.3 points, 1.9 points, 1.6 points, and 1.7 points. Even the authors themselves state that these differences are weak except for neuroticism, which they call moderate. 
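If you want to check the arithmetic yourself, here is a minimal Python sketch of the conversion used in this post: raw gap = |d| × SD. The d values and SDs are the ones quoted above (with 8.99 rounded to 9 for the full sample); the names `D_ALL`, `D_US`, and `raw_gaps` are just labels I made up for illustration, and using a single average SD for every scale is the same simplification the post makes.

```python
# Cohen's d values as quoted in the post (women minus men).
D_ALL = {
    "Neuroticism": 0.40,
    "Extraversion": 0.10,
    "Openness to Experience": -0.05,
    "Agreeableness": 0.15,
    "Conscientiousness": 0.12,
}
D_US = {
    "Neuroticism": 0.53,
    "Extraversion": 0.15,
    "Openness to Experience": -0.22,
    "Agreeableness": 0.19,
    "Conscientiousness": 0.20,
}


def raw_gaps(ds, sd):
    """Convert standardized differences to scale points: gap = |d| * SD.

    Assumes one average SD applies to every scale, as in the post.
    """
    return {trait: abs(d) * sd for trait, d in ds.items()}


# SD of 9 is the post's rounding of 8.99; 8.49 is the US average SD.
all_gaps = raw_gaps(D_ALL, 9.0)
us_gaps = raw_gaps(D_US, 8.49)

for label, gaps in (("All 55 countries", all_gaps), ("United States", us_gaps)):
    print(label)
    for trait, gap in gaps.items():
        print(f"  {trait}: about {gap:.1f} points")
```

Taking the absolute value drops the sign, so the Openness gap shows up as a size only; the sign in the d tables still tells you which group had the higher mean.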

These are probably the most interesting results, mainly because the authors go on to do an analysis in which they average together 4 of the Big 5 traits and run correlations with a measure of general sex differences. Can you tell me what is measured by an average of scores on Neuroticism, Extraversion, Agreeableness, and Conscientiousness? Because I sure can't. This is a major violation of scale unidimensionality (that a scale should only assess one clearly defined construct). Not to mention, the resulting score is meaningless. So the rest of the paper discusses statistical analysis that violates many key assumptions of scales.

So the strongest piece of evidence this Google employee has is a study that found, at best, moderate sex differences in Neuroticism. I'm not sure he should be hanging his women-can't-handle-tech hat on this.


  1. Well done! And you get bonus points for the use of the Princess Bride quote. =)

    1. Thanks! I'll always take bonus points. Glad you liked the post!