Tuesday, August 29, 2017

Social Learning and Amazon Reviews

In my inbox this morning was a new article from Psychological Science exploring how people use statistical and social information. And a great way to examine that is through Amazon reviews.

Social learning - also called vicarious learning - is when we learn by watching others. One of the most famous social learning studies, Bandura's "Bobo doll" study, found that kids could learn vicariously by watching a recording, showing us that it isn't necessary for the learner to be in the same room as the model. The internet has vastly increased our access to social information. But Amazon reviews provide not only social information but also numerical information:
One can learn in detail about the outcomes of others’ decisions by reading their reviews and can also learn more generally from average scores. However, making use of this information demands additional skills: notably, the ability to make intuitive statistical inferences from summary data, such as average review scores, and to integrate summary data with prior knowledge about the distribution of review scores across products.
To generate material for their studies, they examined data from 15 million Amazon reviews (15,655,439 reviews of 356,619 products, each with at least 5 reviews, to be exact). They don't provide a lot of detail in the article, instead referring to other sources, one of which is available here, to describe how these data were collected and analyzed. (tl;dr is that they used data mining and machine learning.)

For experiment 1, people had to make 33 forced choices between two products, each presented along with an average rating and number of reviews. Overall, the most reviewed product had 150 reviews and the least reviewed product had 25, with the other options falling between those two extremes. The article shows an example of one of these choices.

They found that people tended to prefer the product with more reviews more frequently than their statistical model (which factored in both number of reviews and rating) predicted. In short, they were drawn more to the large numbers than to the information the ratings were communicating.
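To make the idea of a model that weighs both rating and number of reviews concrete, here is a minimal sketch in Python. It is not the authors' model: it uses a simple shrinkage estimate that pulls each product's average rating toward an assumed typical rating, with less pull the more reviews there are. The function name, the prior mean of 4.0 stars, the prior weight of 10 reviews, and the two example products are all assumptions made up for illustration.

```python
def shrunk_rating(avg_rating, n_reviews, prior_mean=4.0, prior_weight=10):
    """Blend a product's average rating with an assumed typical rating.

    prior_mean and prior_weight are illustrative assumptions, not values
    from the article: prior_weight acts like that many "phantom" reviews
    sitting at prior_mean.
    """
    total = prior_mean * prior_weight + avg_rating * n_reviews
    return total / (prior_weight + n_reviews)


# Two hypothetical products, both rated below the assumed typical rating.
many_reviews = shrunk_rating(avg_rating=2.9, n_reviews=150)
few_reviews = shrunk_rating(avg_rating=3.1, n_reviews=25)

print(f"2.9 stars, 150 reviews -> estimate {many_reviews:.2f}")
print(f"3.1 stars,  25 reviews -> estimate {few_reviews:.2f}")
```

Under these assumptions the less-reviewed product comes out ahead. The 150 reviews mostly confirm that the first product really is below average, while the 25 reviews leave more room for the second product to be closer to typical.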

Experiment 2 replicated the first experiment, except this time participants made 25 forced choices and the spread in number of reviews was smaller: the minimum was 6 and the maximum was 26. Once again, people were drawn more to the number of reviews than to the ratings. When the researchers pooled results from the two experiments and examined them using meta-analysis techniques, they found that people were unaffected by the drastic differences in number of reviews between experiment 1 and experiment 2. As the authors state in their discussion:
In many conditions, participants actually expressed a reliable preference for more-reviewed products even when the larger sample of reviews served to statistically confirm that a poorly rated product was indeed poor.
Obviously, crowd-sourcing information is a good thing, because, as we understand from the law of large numbers, data from a larger sample is expected to more closely reflect the true population value.

The problem is that people fixate on the amount of information and use that heuristic to guide their decision, rather than using what the information is telling them about quality. And there's a point of diminishing returns on sample size and amount of information. A statistic derived from 50 people is likely closer to the true population value than a statistic derived from 5 people. But doubling your sample from 50 to 100 doesn't double the accuracy. There comes a point where more is not necessarily better, just, well, more. This is a more complex side of statistical inference, one the average layperson doesn't really get into.
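The diminishing returns can be seen in the standard error of a sample mean, which shrinks with the square root of the sample size rather than with the sample size itself. Here is a small Python sketch. The standard deviation of 1 star is an assumption chosen for illustration, not a figure from the article.

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


# Assumed for illustration: review scores vary with a standard deviation of 1 star.
SD = 1.0

for n in (5, 50, 100, 200):
    print(f"n={n:>3}  standard error = {standard_error(SD, n):.3f}")
```

Going from 5 to 50 reviews cuts the standard error by a factor of about 3.2, while doubling from 50 to 100 only cuts it by a factor of about 1.4. Halving the standard error takes four times as many reviews, not twice as many.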

And while we're on the subject of Amazon reviews, there's this hilarious trend where people write joke reviews on Amazon. You can read some of them here.
