Wednesday, March 21, 2018

Statistical Sins: The Myth of Widespread Division

Recently, many people, including myself, have commented on how divided things have become, especially for any topic that is even tangentially political. In fact, I briefly deactivated my Facebook account, and have been spending much less time on Facebook, because of the conflicts I was witnessing among friends and acquaintances. But a recent study of community interactions on Reddit suggests that only a small number of people are responsible for conflicts and attacks:
User-defined communities are an essential component of many web platforms, where users express their ideas, opinions, and share information. However, despite their positive benefits, online communities also have the potential to be breeding grounds for conflict and anti-social behavior.

Here we used 40 months of Reddit comments and posts (from January 2014 to April 2017) to examine cases of intercommunity conflict ('wars' or 'raids'), where members of one Reddit community, called "subreddit", collectively mobilize to participate in or attack another community.

We discovered these conflict events by searching for cases where one community posted a hyperlink to another community, focusing on cases where these hyperlinks were associated with negative sentiment (e.g., "come look at all the idiots in community X") and led to increased antisocial activity in the target community. We analyzed a total of 137,113 cross-links between 36,000 communities.

A small number of communities initiate most conflicts, with 1% of communities initiating 74% of all conflicts. The image above shows a 2-dimensional map of the various Reddit communities. The red nodes/communities in this map initiate a large amount of conflict, and we can see that these conflict-initiating nodes are rare and clustered together in certain social regions. These communities attack other communities that are similar in topic but different in point of view.

Conflicts are initiated by active community members but are carried out by less active users. It is usually highly active users that post hyperlinks to target communities, but it is more peripheral users who actually follow these links and participate in conflicts.

Conflicts are marked by the formation of "echo-chambers", where users in the discussion thread primarily interact with other members of their own community (i.e., "attackers" interact with "attackers" and "defenders" with "defenders").
So even though the conflict may appear to be a widespread problem, it really isn't, at least not on Reddit. Instead, it's only a handful of users (trolls) and communities. Here's the map they reference in their summary:

The researchers will be presenting their results at a conference next month. And they also make all of their code and data available.
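
If you want to check a concentration claim like "1% of communities initiate 74% of all conflicts" against a dataset of your own, the arithmetic is simple. Below is a minimal sketch in Python, not the researchers' code. The function name `top_share`, the community names, and the counts are all made up for illustration, and the toy data is rigged so that one community out of 100 starts 74 of 100 conflicts.

```python
import math
from collections import Counter


def top_share(conflicts_per_community, percent=1):
    """Fraction of all conflicts started by the top `percent`% of communities.

    `conflicts_per_community` maps a community name to the number of
    conflicts it initiated (communities with zero conflicts should be
    included, or the denominator of "1% of communities" will be too small).
    """
    counts = sorted(conflicts_per_community.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    # Always look at at least one community, even in tiny samples.
    top_k = max(1, math.ceil(len(counts) * percent / 100))
    return sum(counts[:top_k]) / total


# Hypothetical data: each entry is the source community of one conflict.
initiators = ["hub"] * 74 + [f"small_{i}" for i in range(26)]
conflicts = Counter(initiators)

# Add 73 hypothetical communities that never started a conflict,
# bringing the total to 100 communities.
for i in range(73):
    conflicts.setdefault(f"quiet_{i}", 0)

share = top_share(conflicts, percent=1)
print(f"Top 1% of communities start {share:.0%} of conflicts")
```

Note that the result depends heavily on which communities count toward the denominator and on how a "conflict" is defined, so the same function can give very different answers under a different operationalization.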

  1. A very interesting post. However, the definition of conflict doesn't take into account some of the specific behavioural codes of Reddit. Linking directly to another subreddit is highly discouraged on the site; when doing so, it's customary to use an np (no participation) link, which doesn't allow you to up- or down-vote, or comment in the thread. Focusing on individuals who ignore this code of conduct will bias the sample. As such, it's no surprise that 1% of the subreddits initiate 74% of this type of conflict, as these are likely subreddits where the moderators are not enforcing this rule. I suspect a broader operationalization of conflict (for instance, all negatively stated interactions) would lead to a far more equal distribution of conflict across communities.