Wednesday, March 7, 2018

Statistical Sins: Gender and Movie Ratings

Though I try to feature my own content, analysis, and thoughts in my statistics posts, occasionally I encounter a really well-done analysis that I'd rather feature instead. So today, for my Statistical Sins post, I encourage you to check out this excellent analysis from FiveThirtyEight that uncovers what would qualify as a statistical sin. You see, when conducting opinion polling, it's important to correct for discrepancies between the characteristics of the sample and those of the population, characteristics like gender. But apparently, IMDb ratings also show such discrepancies: men often outnumber women among a movie's raters, sometimes by as much as 10-to-1. And if you want to put together definitive lists of the best movies, you need to either caveat the drastic differences between the population and the raters or make it clear that the results are heavily skewed by one gender.
The Academy Awards rightly get criticized for reflecting the preferences of a small, unrepresentative sample of the population, but online ratings have the same problem. Even the vaunted IMDb Top 250 — nominally the best-liked films ever — is worth taking with 250 grains of salt. Women accounted for 52 percent of moviegoers in the U.S. and Canada in 2016, according to the most recent annual study by the Motion Picture Association of America. But on the internet, and on ratings sites, they’re a much smaller percentage.

We’ll start with every film that’s eligible for IMDb’s Top 250 list. A film needs 25,000 ratings from regular IMDb voters to qualify for the list. As of Feb. 14, that was 4,377 titles. Of those movies, only 97 had more ratings from women than men. The other 4,280 films were mostly rated by men, and it wasn’t even close for all but a few films. In 3,942 cases (90 percent of all eligible films), the men outnumbered the women by at least 2-to-1. In 2,212 cases (51 percent), men outnumbered women more than 5-to-1. And in 513 cases (12 percent), the men outnumbered the women by at least 10-to-1.

Looking strictly at IMDb’s weighted average — IMDb adjusts the raw ratings it gets “in order to eliminate and reduce attempts at vote stuffing,” but it does not disclose how — the male skew of raters has a pretty significant effect. In 17 percent of cases, the weighted average of the male and female voters was equal, and in another 26 percent of cases, the votes of the men and women were within 0.1 points of one another. But when there was bigger disagreement — i.e. men and women rated a movie differently by 0.2 points or more, on average — the overall score overwhelmingly broke closer to the men’s rating than the women’s rating. The score was closer to the men’s rating more than 48 percent of the time and closer to the women’s rating less than 9 percent of the time, meaning that when there was disagreement, the male preference won out about 85 percent of the time.
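
The percentages in the excerpt above can be checked directly from the counts it quotes. Here is a quick sketch in Python; the variable names are mine, and because the article only gives "more than 48 percent" and "less than 9 percent," the last figure is an approximation built from those rounded bounds.

```python
# Counts quoted from the FiveThirtyEight article (as of Feb. 14, 2018).
eligible = 4377
more_women_than_men = 97
at_least_2_to_1 = 3942
more_than_5_to_1 = 2212
at_least_10_to_1 = 513


def share(count, total=eligible):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


# Films where men supplied most of the ratings.
mostly_men = eligible - more_women_than_men

# Shares of films where the overall score landed closer to one gender's rating.
# These are the rounded bounds from the article, not exact values.
closer_to_men = 0.48
closer_to_women = 0.09

# Among films with a real disagreement, how often did the men's rating win out?
male_win_rate = closer_to_men / (closer_to_men + closer_to_women)

print(mostly_men)                # 4280
print(share(at_least_2_to_1))    # 90
print(share(more_than_5_to_1))   # 51
print(share(at_least_10_to_1))   # 12
print(round(male_win_rate, 2))   # 0.84
```

Using exactly 48 and 9 gives about 84 percent; since the true shares are a bit above 48 and a bit below 9, that squares with the article's "about 85 percent."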

In the article, a table of the top 500 movies (based on weighted data) demonstrates how gender information impacts these rankings. For each movie, the table shows where the movie currently ranks, where it would rank based on women's or men's ratings only, and where it would rank when the data are weighted to account for the discrepancy in the proportions of men and women. Movies like The Shawshank Redemption (#1) and The Silence of the Lambs (#23) would remain mostly unchanged. Movies like Django Unchained (#60) and Harry Potter and the Deathly Hallows: Part 2 (#218) would move up to #34 and #50, respectively, while Seven Samurai (#19) and Braveheart (#75) would move down to #59 and #112, respectively. And finally, movies that never made it onto the top 250 list, like Slumdog Millionaire and The Nightmare Before Christmas, would have rankings of #186 and #199, respectively.
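
To see why reweighting can move a movie so far, here is a minimal sketch of the general idea in Python. It is not FiveThirtyEight's or IMDb's actual method (IMDb does not disclose its weighting), and the film, its ratings, and the function names are all made up for illustration. It simply compares a pooled average, where every rater counts equally, with an average that weights each gender's mean by an assumed population share.

```python
def pooled_rating(men_mean, men_n, women_mean, women_n):
    """Average over all raters, so the larger group dominates the result."""
    return (men_mean * men_n + women_mean * women_n) / (men_n + women_n)


def reweighted_rating(men_mean, women_mean, women_share=0.5):
    """Weight each gender's mean rating by its assumed share of the population."""
    if not 0.0 <= women_share <= 1.0:
        raise ValueError("women_share must be between 0 and 1")
    return (1 - women_share) * men_mean + women_share * women_mean


# Hypothetical film rated by ten times as many men as women.
men_mean, men_n = 8.4, 50_000
women_mean, women_n = 7.6, 5_000

pooled = pooled_rating(men_mean, men_n, women_mean, women_n)
balanced = reweighted_rating(men_mean, women_mean, women_share=0.5)

print(round(pooled, 2))    # 8.33
print(round(balanced, 2))  # 8.0
```

With a 10-to-1 skew, the pooled score sits almost on top of the men's average. Weighting the two groups equally pulls it a third of a point toward the women's average, which is plenty to reshuffle a list where films are separated by tenths of a point. Setting `women_share=0.52` would instead match the moviegoer figure quoted above.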
