Thursday, April 20, 2017

Q is for Quota Sampling

I'm kind of cheating, because this is more of a methods topic than a statistics topic. But, as I've argued from atop my psychology pedagogy soapbox, the two are very much connected ("and should be taught as a combined course!" I shout from atop my... well, you get the idea). Your methods can introduce bias, increasing the probability of things like Type I error, and the methods you use can also impact what statistical analyses you can/should use.

Here's something you may not realize: nearly every statistical analysis you learn about in an introductory statistics course assumes random sampling, meaning the sample you used in the study had to be randomly selected from the population of interest. In other words, every person in the population you're interested in (who you want to generalize back to) should have an equal probability of being included in the study.

Here's something you probably do realize: many studies are conducted on college students, mainly students currently taking introductory psychology (and thus, mostly freshmen). Further, students are usually given access to a list of studies needing participants and they select the ones to participate in.

See the issue here? We analyze data using statistics meant for random sampling, on studies that used convenience sampling (i.e., not random). In fact, there's even some potential for selection bias since people choose which studies to participate in. There is much disagreement on whether this is a big deal or not. This is why I balk when people act as though statistics and research issues are clear-cut and unanimously agreed upon.

In fact, true random sampling is pretty much impossible. If your study requires people to come into the lab, you can't exactly recruit people at random from around the world, or even more narrowly, around the US. Survey research firms probably come the closest to true random sampling, but even then, there are limitations. Random digit dialing will miss people who don't have a phone (which, true, is very few people), and people who share a phone with one or more others will have a different probability of being selected than people who have a phone to themselves. If your population is narrower than, say, the entire US population, it might be a little more doable to have nearly random sampling, but there's also that pesky issue of consent. You can't force people to participate in your study unless you're the Census Bureau and can threaten them with legal action if they fail to comply. No matter what, you're going to have selection bias.

But fine, let's say we can actually have truly random sampling. We still might not end up with a sample that accurately represents the population. Why? Because probability. (For those playing along at home, that's been the answer to nearly every rhetorical question this month.)

Weird things can happen when you let something be random. Like 10 heads in a row, or snake eyes twice in a row, or a sample of 70% women from a population that is 50% women. Sometimes we have to give probability a hand, so we might stratify our sample, to ensure its makeup matches the population on the characteristics we care about. So if our population is 50% women, we would force our sample to be 50% women.
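To put rough numbers on those "weird things," here's a quick sketch. The coin and dice are assumed to be fair, and the sample size of 20 is my own assumption for illustration (the odds of a lopsided sample shrink as the sample gets bigger).

```python
from fractions import Fraction
from math import comb

# Ten heads in a row with a fair coin.
ten_heads = Fraction(1, 2) ** 10

# Snake eyes (both dice showing 1) twice in a row with fair dice.
snake_eyes_twice = Fraction(1, 36) ** 2


def prob_at_least(k, n, p=Fraction(1, 2)):
    """P(X >= k) when X counts successes in n independent draws with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed sample size of 20: chance that 14 or more (70%+) are women
# when the population is 50% women.
women_70_percent = prob_at_least(14, 20)

print(ten_heads, snake_eyes_twice, float(women_70_percent))
```

Under those assumptions, ten heads in a row comes out to 1 in 1,024, double snake eyes to 1 in 1,296, and a sample of 20 that is at least 70% women to a little under 6%. Unlikely, but far from impossible.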

We select the characteristics that matter to us - usually things like gender, race, ethnicity, socioeconomic status, and so on, but it also depends on what you're studying - and draw our sample to ensure it has essentially the same proportions of these different characteristics as we see in the population. We call this stratified random sampling.
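Here's a minimal sketch of what that looks like in practice, assuming you have a full list of the population to draw from (which, as discussed above, you usually don't). The function name `stratified_sample`, the toy population, and the single `gender` characteristic are all made up for illustration.

```python
import random


def stratified_sample(population, key, n, seed=None):
    """Randomly draw about n people, keeping each group's share close to its population share."""
    rng = random.Random(seed)

    # Sort everyone into strata based on the characteristic we care about.
    strata = {}
    for person in population:
        strata.setdefault(key(person), []).append(person)

    sample = []
    for members in strata.values():
        # Each stratum gets slots in proportion to its size in the population.
        k = round(n * len(members) / len(population))
        # Within a stratum, selection is still random.
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 500 women and 500 men.
population = [{"id": i, "gender": "woman" if i % 2 == 0 else "man"} for i in range(1000)]
sample = stratified_sample(population, key=lambda p: p["gender"], n=100, seed=1)
```

With this toy population, the sample of 100 always contains exactly 50 women and 50 men; only which women and which men is left to chance. Because of rounding, the total can be off by a person or two when the proportions don't divide evenly.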

So why is the title of this post quota sampling? As I said, many studies are conducted using convenience samples, especially when random sampling would be costly, time-consuming, and/or impossible. But it might still be important to us to have similar characteristics as the population. So we set quotas.

If I want to make sure my sample is 50% women, I would open up half my slots for women, and when I had as many women as I needed, I would close that portion of the study. Probably the easiest way to accomplish this is with a screening questionnaire or interview. Screening is done to exclude people who don't qualify for the study for some reason (e.g., they had 5 cups of coffee this morning), but it can also be used to enforce quotas. Quota sampling is the non-random counterpart to stratified random sampling.
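The screening logic can be sketched like this. It's just an illustration: the function name `quota_sample`, the sign-up list, and the quotas of 5 are hypothetical. The key difference from stratified random sampling is that nobody is randomly selected; people are taken in whatever order they show up.

```python
def quota_sample(volunteers, key, quotas):
    """Accept volunteers in arrival order until every group's quota is full."""
    remaining = dict(quotas)
    accepted = []
    for person in volunteers:
        group = key(person)
        # Screen each volunteer: accept only if their group still has open slots.
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        # Close the study once every quota is filled.
        if not any(remaining.values()):
            break
    return accepted


# Hypothetical sign-up order: three women volunteer for every man.
volunteers = [{"id": i, "gender": "man" if i % 4 == 0 else "woman"} for i in range(40)]
accepted = quota_sample(volunteers, key=lambda p: p["gender"], quotas={"woman": 5, "man": 5})
```

Here the women's slots fill up quickly, and later women are turned away while the study waits for enough men to sign up. The final sample is 50% women, but it's still a convenience sample within each group.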

So if you're using a convenience sample (and let's face it, most researchers are), but want it to mirror the characteristics of the population, use quota sampling.


  1. Ah yes, that does make sense. I hadn't thought too much about "random" stats before, but I have wondered how they make it happen and where they get the people from. Thanks for sharing!

    Visiting from the A to Z Challenge. You can see my "Q" post here:

  2. I was one of those guinea pigs. I think being forced to participate is another example of convenience sampling. But, yes, true random sampling is near impossible.

    Researchers are conducting their studies for a reason, so they'll narrow the scope of survey-takers, or study participants to the population they want to mirror (as you said). I didn't know about quota sampling though...or I didn't know it was called that. I'm all the smarter for it now! :D

  3. Interesting. That makes complete sense. And I totally would have never considered statistics and sampling as two different things even though it makes sense that they could be… Anyway, thanks for sharing. Your post made me think. :)

    With Love,