Deeply Trivial: Statistical Sins in History: Handling and Understanding Criticism

Today's Statistical Sins will be a little bit different, using an example from the history of statistics to talk about an aspect of research publication. I'm currently reading Fisher, Neyman, and the Creation of Classical Statistics by Erich L. Lehmann. I've talked before about Egon Pearson, who was Jerzy Neyman's long-time collaborator. The feud between Neyman and Ronald Fisher is legendary in statistical history, as is the feud between Karl Pearson and Fisher. But not as much attention has been given to the feud between E. Pearson and Fisher. The start of that feud - though arguably mostly caused by Fisher and K. Pearson's ongoing competition over who could be more petty - can probably be traced to a review, authored by E. Pearson, of Fisher's book, Statistical Methods for Research Workers.

[Image: the title page from the first edition]

The review in question covered the second edition of Fisher's book. It was positive overall, except for this (note: this is quoted from Fisher, Neyman, and the Creation of Classical Statistics; I didn't track down the original):

There is one criticism, however, which must be made from the statistical point of view. A large number of the tests developed are based... on the assumption that the population sampled is of the "normal" form. That this is the case may be gathered from a careful reading of the text, but the point is not sufficiently emphasized. It does not appear reasonable to lay stress on the "exactness" of the tests when no means whatever are given of appreciating how rapidly they become inexact as the population sampled diverges from normality... [N]o clear indication of the need for caution in their application is given.

The issues E. Pearson is addressing here are 1) the robustness of a test and 2) determining how far a dataset needs to diverge from normal before it no longer satisfies the requirements of the test. These are legitimate questions, and further, there is a very good reason E. Pearson raised them. But first, the fallout.
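A quick aside to make the robustness question concrete: one way to see how "exact" a test stays when the population isn't normal is to simulate it. The sketch below is only an illustration, not something taken from the review or from Lehmann's book. It assumes a one-sample t-test with samples of size 10, uses an exponential population as the stand-in for "non-normal," and picks the number of replications and the random seed arbitrarily. In both cases the null hypothesis is true, so a well-behaved test at the 5% level should reject about 5% of the time.

```python
import math
import random

# Two-sided 5% critical value of Student's t with 9 degrees of freedom (n = 10).
T_CRIT_DF9 = 2.262


def rejection_rate(draw, true_mean, n=10, reps=20000, seed=1):
    """Share of simulated samples in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        mean = sum(sample) / n
        # Sample standard deviation with the usual n - 1 denominator.
        sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        t_stat = (mean - true_mean) / (sd / math.sqrt(n))
        if abs(t_stat) > T_CRIT_DF9:
            rejections += 1
    return rejections / reps


# Normal population with mean 0: the test's assumption holds.
normal_rate = rejection_rate(lambda rng: rng.gauss(0.0, 1.0), true_mean=0.0)

# Exponential population with mean 1: strongly skewed, so the assumption fails.
skewed_rate = rejection_rate(lambda rng: rng.expovariate(1.0), true_mean=1.0)

print(f"Rejection rate, normal population:      {normal_rate:.3f}")
print(f"Rejection rate, exponential population: {skewed_rate:.3f}")
```

Comparing the two printed rates against the nominal 0.05 gives a rough sense of how much the test's error rate drifts for this particular kind of non-normality. Swapping in other populations or sample sizes is one way to explore how quickly things go wrong.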

Fisher was pissed. He was so pissed he wrote a response to the journal that originally published the review (Nature). We don't know exactly what this letter said, but based on later correspondence, it appears Fisher believed the question of normality was irrelevant to the content of the book (and I'm sure there was some name-calling as well). As often occurs with a letter to the editor regarding a published paper, the editor sent it to E. Pearson and asked if he would like to respond. E. Pearson wrote his response but, before sending it off, showed it to William Sealy Gosset.

Gosset, who had a good working relationship with Fisher, decided to serve as mediator, and wrote a letter to Fisher to try to settle the dispute. Apparently that approach worked, because Fisher decided to withdraw his letter to Nature (which is why we don't know what it said) and suggested Gosset should instead write a letter (on Fisher's behalf) responding to E. Pearson's review. Of course, Fisher did end up writing a response... to Gosset's letter, because Gosset agreed with E. Pearson's comment about normality. Gosset wrote that, though he believed the Student distribution (which he created) could withstand "small departures from normality," more research was needed on the topic, and in the meantime, experts in statistical distributions (like Fisher) could offer guidance on what to do when data aren't normal. Gosset knew Fisher was the better mathematician, and likely saw this as a way of asking Fisher for help in answering these questions.

Fisher, instead, brought up the possibility of distribution-free tests.

The thing Fisher never really considered is why E. Pearson was so fixated on this issue of robustness and normality. Do you know what two of E. Pearson's contributions to the field of statistics are? Exploration into determining the best goodness of fit test (that is, the best way to determine if a set of data matches a theoretical distribution, like the normal distribution - part of his collaboration with Neyman) and the concept of robustness. In fact, he was already working on much of this when he wrote that review in 1929.
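For readers who want a feel for what "matches a theoretical distribution" means in practice, here is a minimal sketch. It is not E. Pearson's or Neyman's method; it assumes a Kolmogorov-Smirnov-style distance (the largest gap between the data's empirical distribution and a normal curve) as one simple way to measure fit. The function name, sample sizes, and seed are made up for illustration. Because the normal curve's mean and standard deviation are estimated from the same data, the standard Kolmogorov-Smirnov critical values would not apply here, so the sketch only compares distances rather than computing a p-value.

```python
import random
import statistics


def distance_from_normal(data):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    fitted = statistics.NormalDist(statistics.fmean(data), statistics.stdev(data))
    ordered = sorted(data)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        gap = max(gap, i / n - cdf, cdf - (i - 1) / n)
    return gap


rng = random.Random(7)
bell_shaped = [rng.gauss(50.0, 10.0) for _ in range(500)]  # drawn from a normal
skewed = [rng.expovariate(0.02) for _ in range(500)]       # drawn from an exponential

print(f"Distance from normal, bell-shaped data: {distance_from_normal(bell_shaped):.3f}")
print(f"Distance from normal, skewed data:      {distance_from_normal(skewed):.3f}")
```

A smaller distance means the data sit closer to the fitted normal curve. Deciding how large a distance is "too large" is exactly the kind of question a formal goodness of fit test is built to answer.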

E. Pearson was not trying to make Fisher look bad or call him dumb. On the contrary: E. Pearson was trying to connect what he was working on to Fisher's work and set the stage for his own contributions. In fact, this is often the reason researchers will criticize another researcher's work in a paper or letter to the editor: they're setting the stage for the contribution they're about to make. They're taking the opportunity to say "we need X," only to turn around and deliver X soon after.

This is done all the time. People even do it in their own papers, when they highlight a certain shortcoming of their research in the discussion section; they're probably highlighting a flaw that they've already figured out how to fix and may already be testing in a new study. (Or they added it to make a reviewer happy.)

Fisher responded the way he did because he couldn't see the reason E. Pearson was criticizing him. He just saw the criticism and went into rage mode. It's easy to do. Hearing criticism sucks. And while, as researchers, we frequently have to deal with criticism of our work in dissertation defenses and peer reviews, that criticism is rarely as public as a published book review or letter to the editor.

I'll admit, when I received an email from a journal that someone had written a letter to the editor in response to one of my articles (and asking if I'd like to write a response), I made that sound kids make when they have a skinned knee.

It took some courage to open the file and read the letter. I was amazed to see it was incredibly positive. I can only imagine what my reaction would be if it hadn't been positive.

But if we can take a step back and realize why a researcher might be leveling a particular criticism, it might make it a bit easier to handle the hurt feelings. Who knows how different things would have been for the field of statistics if - instead of throwing a tantrum and writing a pissed-off letter to the editor - Fisher had written E. Pearson a letter directly saying, "I think this issue of normality is irrelevant to what I was trying to do. Why do you think it's important?" Maybe we would be talking today about the amazing collaboration between E. Pearson and Fisher. (Probably not, but a girl can dream, right?)

Wednesday, August 30, 2017
