Sunday, January 7, 2018

Statistics Sunday: On Birthdays, and One-Tailed and Two-Tailed Tests

It's my birthday today! I spent last night celebrating with friends. Today I'll be relaxing, reading, and writing, plus dinner at one of my favorite restaurants. Obviously, birthdays have some important implications:


For today's Statistics Sunday post, I wanted to revisit a topic I touched on briefly in my post on null and alternative hypotheses, using our silly hypothesis above as an example. We could update this hypothesis to be something about having more birthday parties, but obviously, there would still be a third variable issue here.

Whenever you set out to conduct statistical analyses, you have hypotheses you want to test. While I would argue that you should focus entirely on your research hypotheses, rather than actively writing out null and alternative hypotheses, it is a good thing to remember what it is you're testing and how you will know you have the expected effect. And that's one way your statistical hypotheses (null and alternative) can help guide you.

Those statistical hypotheses, among other things, allow you to derive your critical value - the cut-off between significant and non-significant. This is based on your selected alpha. When you're testing for a difference, one that could go in either direction, you split your alpha in two and put half in each tail. In this case, you would have a cut-off in both tails of your normal distribution.
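To make the cut-offs concrete, here is a minimal sketch that computes critical values from the standard normal distribution. It assumes a z-test and an alpha of .05, and the function name `critical_values` is just something I made up for illustration. A t-test would use the t distribution instead, so its cut-offs would be somewhat larger in small samples.

```python
from statistics import NormalDist

def critical_values(alpha=0.05):
    """Return z cut-offs for a two-tailed test and an upper one-tailed test."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # Two-tailed: alpha is split in half, so each tail holds alpha / 2.
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)
    # One-tailed: all of alpha sits in the upper tail.
    one_tailed = std_normal.inv_cdf(1 - alpha)
    return two_tailed, one_tailed

two_tailed, one_tailed = critical_values(0.05)
print(f"two-tailed: reject if |z| > {two_tailed:.3f}")         # 1.960
print(f"one-tailed (upper): reject if z > {one_tailed:.3f}")   # 1.645
```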

But if I wanted to test the hypothesis above, that people who have had more birthdays live longer, it doesn't make sense to test for a difference. It makes sense to test in a certain direction: people who have had more birthday parties live longer. In this case, you can maximize the chance that you'll find a significant effect by putting all of your alpha in one tail.

If you did a really simple study, where you recruited people who had never had a birthday party and people who celebrated their birthdays, then tracked how long they lived, you would set your cut-off value so that all of your alpha is in the upper tail of your distribution - and you would only reject if your birthday partiers were significantly older at their time of death than your birthday non-partiers. Your cut-off value will be lower as a result, making it easier to reject the null hypothesis, but you also only get to reject the null hypothesis if the difference is in the expected direction (that is, the partiers were significantly older rather than significantly younger).
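Here is a rough sketch of that trade-off in code. It assumes the study's result has already been boiled down to a z statistic (partiers minus non-partiers), and the test statistics below are invented purely for illustration, not real data. The function name `reject_null` and its `tails` argument are hypothetical too.

```python
from statistics import NormalDist

def reject_null(z_stat, alpha=0.05, tails="two"):
    """Return (reject?, p-value) for a z statistic.

    tails="two"   tests for a difference in either direction.
    tails="upper" tests only for partiers living longer (positive z).
    """
    std_normal = NormalDist()
    if tails == "two":
        p = 2 * (1 - std_normal.cdf(abs(z_stat)))
    elif tails == "upper":
        p = 1 - std_normal.cdf(z_stat)
    else:
        raise ValueError("tails must be 'two' or 'upper'")
    return p < alpha, p

# Hypothetical test statistics, made up for illustration only.
for z in (1.8, -3.0):
    for tails in ("two", "upper"):
        reject, p = reject_null(z, tails=tails)
        print(f"z = {z:+.1f}, {tails}-tailed: p = {p:.4f}, reject = {reject}")
```

With z = 1.8, only the one-tailed test rejects, because its cut-off is lower. With z = -3.0, only the two-tailed test rejects, because the one-tailed test ignores a difference in the unexpected direction, no matter how large.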

This concept is taught in statistics courses, and students are often tested on whether they understand one-tailed versus two-tailed tests. The thing is, I rarely see this done in journal articles. Not every statistical analysis allows for one- versus two-tailed tests, and it wouldn't make sense to do this for more complex analyses. But it's surprising that when people are doing simple t-tests, which I do see in journal articles, there doesn't seem to be any consideration of directionality of hypotheses.

What do you think? Have you encountered one- and two-tailed tests outside of statistics courses?

4 comments:

  1. IMO, a one-tailed test should be used only when the other tail is an impossibility. You've a you're/your typo here "remember what it is your testing"

  2. In most applications a one sided hypothesis is only tenable if you can logically discount one of the two possibilities. There are situations where a one sided test is permissible e.g. interim analyses for trial progression or to meet certain statutory requirements. Usually though one sided tests and one sided confidence intervals are not justifiable. Wishful thinking is not a justification.

  3. Why is it relevant, other than to convince the kind of journal editor who thinks this sort of thing is meaningful or important, to "maximize the chance that you'll find a significant effect"? Why not just double my alpha to .10? How, in practice, is that different?

    Replies
    1. I think what I've learned in my exposure to and experience with statistics over the years is that there is always a series of trade-offs one must make. In reducing Type I error, we increase our risk of Type II error, and vice versa. The trade-off with a one-tailed test is that, though you get essentially double the alpha (but not really), it's only for one direction of difference. A difference in the other direction, no matter how large, must be discounted, because you traded direction for double alpha. Either way, your alpha is the same; it's just a question of whether you want the increased power to detect an effect in only one direction, or to be open to the possibility that the direction could go either way. The two commenters above make excellent points that one-tailed tests are probably best used in situations where only one direction is possible or logical. And obviously, the whole one- versus two-tailed dichotomy doesn't apply to many analyses. So for the most part, tests will be two-tailed, either by choice or necessity.
