Inspired by collaborator Julia Silge's similar analysis of gender and n-grams in Jane Austen's novels, he used the Wikipedia plots dataset to examine which verbs follow the words "he" and "she":
I think this paints a somewhat dark picture of gender roles within typical story plots. Women are more likely to be in the role of victims: “she screams”, “she cries”, or “she pleads.” Men tend to be the aggressor: “he kidnaps” or “he beats”. Not all male-oriented terms are negative (many, like “he saves”/“he rescues”, are distinctly positive), but almost all are active rather than receptive.
Women in storylines are not always passive victims: the fact that the verb “stabs” is shifted towards female characters is interesting. What does the shift look like for other words related to violence or crime?
There’s an old stereotype (that’s appeared in works like Game of Thrones and Sherlock Holmes) that “poison is a woman’s weapon”, and this is supported in our analysis. Female characters are more likely to “poison”, “stab”, or “kick”; male characters are more likely to “beat”, “strangle”, or simply “murder” or “kill”. Men are moderately more likely to “steal”, but much more likely to “rob”.
As always, he gives you all of his code and encourages readers to do some analyses of their own. Hopefully I'll have a chance to play with some of the data myself.
It’s interesting to compare this to an analysis from the Washington Post of real murders in America. Based on this text analysis, a fictional murderer is about 2.5X as likely to be male as female, but in America (and likely elsewhere) murderers are about 9X more likely to be male than female. This means female murderers may be overrepresented in fiction relative to reality.
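For anyone who wants a feel for what a "shift" toward "he" or "she" might look like in code, here is a minimal Python sketch. It is not the original analysis: the toy plots are made up for illustration, the function names are hypothetical, and the add-one smoothing and log2 ratio are my assumptions about one reasonable way to score the shift.

```python
import math
import re
from collections import Counter


def following_word_counts(texts, pronoun):
    """Count the word that immediately follows `pronoun` across all texts."""
    # \b keeps "he" from matching inside "she" or "the".
    pattern = re.compile(r"\b" + re.escape(pronoun) + r"\s+([a-z']+)", re.IGNORECASE)
    counts = Counter()
    for text in texts:
        for match in pattern.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


def log2_ratio(she_counts, he_counts):
    """Per word: log2 of the smoothed "she" share over the smoothed "he" share.

    Add-one smoothing is an assumption here; it avoids dividing by zero
    for words that follow only one of the two pronouns.
    """
    vocab = set(she_counts) | set(he_counts)
    she_total = sum(she_counts.values()) + len(vocab)
    he_total = sum(he_counts.values()) + len(vocab)
    return {
        word: math.log2(
            ((she_counts[word] + 1) / she_total) / ((he_counts[word] + 1) / he_total)
        )
        for word in vocab
    }


# Made-up toy plots, for illustration only (not the Wikipedia plots data).
plots = [
    "She screams when he kidnaps her. He beats the guard.",
    "She poisons the king. He kills the witness and she cries.",
    "He kills again before she stabs him.",
]

she = following_word_counts(plots, "she")
he = following_word_counts(plots, "he")
ratios = log2_ratio(she, he)

# Print words from most "she"-shifted to most "he"-shifted.
for word, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{word:10s} {ratio:+.2f}")
```

Positive scores mean the word follows "she" relatively more often, and negative scores mean it follows "he" more often. Note that this sketch counts whatever word comes next, verb or not; a real analysis would need to handle that, and would need far more text than three sentences before the ratios mean anything.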