Thursday, December 14, 2017

Statistical Sins: Not Double-Checking Results

In a previous Statistical Sins post, I talked about the importance of knowing one's variables. Knowing the range and source of your variables is necessary to make sure you're using the correct ones in your analyses. This is an important quality control step, and it should happen before any analyses are run.

But good quality control shouldn't stop there. Results should be double-checked, and compared to each other, to make sure it all makes sense. This sounds painfully obvious, but unfortunately, this step is skipped too often. For instance, check out the results of the Verge technology survey, and specifically one of the glaring issues pointed out by Kaiser Fung on Junk Charts:
Last time, I discussed one of the stacked bar charts about how much users like or dislike specific brands such as Facebook and Twitter. Today, I look at the very first chart in the article.

This chart supposedly says users trust Amazon the most among those technology brands, just about the same level as customers trust their bank.

The problems of this chart jump out if we place it side by side with the chart I discussed last time.

Now, the two charts use different data - the first chart is a "trust" rating scale, while the second is a "like" rating scale. But notice that in the first chart, yellow is said to stand for "No opinion or don't use," while in the second chart, that category is reflected in gray. It seems highly unlikely that people have an opinion on liking something but not trusting that same institution. The two scales would likely be highly correlated with each other. Also, the chart on the left is missing the "somewhat" category, making the rating scale asymmetrical.

What probably happened is that the "no opinion" category was inadvertently dropped from the chart on the left, a mistake that could (and should) have been caught with a thorough review of the results.

I remember getting ready for a presentation once and going over my slides when I noticed my standard deviations made no sense - they were too small. Cue a mini panic attack, since I was presenting in 15 or so minutes at that point. I pulled out the printout of my results and saw that I'd accidentally reported standard errors instead of standard deviations. Fortunately, the room I was presenting in was empty, and I was able to pull up my file on its computer and correct the values in my tables.
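The slip is easy to make because the two statistics differ only by a factor of the square root of the sample size, so a column of standard errors looks like a column of implausibly small standard deviations. A minimal sketch in Python (the scores here are made up purely for illustration):

```python
import math
import statistics

# Hypothetical test scores; illustrative data only.
scores = [12, 15, 14, 10, 13, 16, 11, 14, 15, 12]

sd = statistics.stdev(scores)        # sample standard deviation
se = sd / math.sqrt(len(scores))     # standard error of the mean

# The SE is smaller than the SD by a factor of sqrt(n), which is
# why mislabeled standard errors look suspiciously small.
print(f"SD = {sd:.3f}, SE = {se:.3f}")
```

A quick sanity check like this - does the spread statistic have the magnitude I'd expect for this scale? - is exactly the kind of review that catches the mix-up before the audience does.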

When I first started working as a psychometrician, I was introduced to a very involved quality control process: two people start from the same raw data and independently go through the whole pipeline of cleaning, creating new variables, and analyzing results, preferably with different analysis programs. Since R was my program of choice, I would usually use that, while my counterpart in quality control would often use SAS or SPSS.
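The double-entry idea can be mimicked even inside a single script: two code paths independently compute the same quantity from the same raw data, and the results are compared mechanically rather than by eye. A toy sketch in Python, with made-up data standing in for a shared raw file:

```python
import csv
import io
import statistics

# Hypothetical raw data, standing in for a shared data file.
raw = "id,score\n1,12\n2,15\n3,14\n4,10\n5,13\n"

# Analyst A: parse with the csv module.
rows_a = list(csv.DictReader(io.StringIO(raw)))
mean_a = statistics.mean(float(r["score"]) for r in rows_a)

# Analyst B: deliberately take a different route, splitting by hand.
lines = raw.strip().split("\n")[1:]
scores_b = [float(line.split(",")[1]) for line in lines]
mean_b = sum(scores_b) / len(scores_b)

# The QC check: independent pipelines must agree within tolerance.
assert abs(mean_a - mean_b) < 1e-9, "results diverge - investigate!"
print(mean_a)  # prints 12.8
```

The point isn't the arithmetic; it's that agreement between genuinely independent computations is a much stronger check than one person re-reading their own code.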

Mistakes happen. This is one reason we have errata published in journals. And online articles can be easily corrected. The Verge would probably do well to fix some of these mistakes.
