Saturday, March 26, 2016

What is Social Psychology: A Taste of What's to Come in April

I'm so excited for the April A-Z Blog Challenge, and am thrilled to get comments from new visitors excited about the topic. A couple of you mentioned that you haven't heard the term "social psychology" before (and trust me, you're not alone - probably half of the people who hear that my degree is in social psychology are unfamiliar with the term), so I thought now would be a good time to give a brief overview, and let you know some things to expect in April.

Social psychology is a subfield of psychology. In fact, there are many subfields; just to name a few: developmental psychology, cognitive psychology, behavioral psychology, and clinical psychology. All of these subfields are part of the science of how people think, feel, and behave (the overall definition of psychology), but they go about understanding and studying those concepts in different ways.

Social psychology focuses on how people think, feel, and behave in social settings and examines social factors (like other people, groups of people, even societal norms) that impact how people think, feel, and behave. That is, even when you are by yourself, social forces like the media or the groups to which you belong continue to influence you.

Many people ask me how social psychology differs from sociology (an excellent question!). Sociology is about studying very large groups - like race, culture, or country of origin - and the overall behaviors of those large groups. Social psychology, on the other hand, is about individuals and their behavior, and in some cases, small groups, like juries; in fact, jury decision-making is a big topic in social psychology. Sociology and social psychology also use different theories and sometimes different methods to study concepts.

So what are a few key topics in social psychology? As I mentioned above, jury decision-making, and the related topic of how small groups make decisions, is a big one. I also mentioned the media as an influence on behavior, which is one of my favorite areas to study in social psychology. In fact, some social psychologists study whether the media encourages aggression or violence. If you've taken introductory psychology, you may remember watching a video about the "Bobo doll" study, a study by Albert Bandura examining whether watching violent media made children more likely to punch a Bobo doll.

Honestly, I'd punch anything that looks like a clown. Clowns are terrifying.

If you didn't take introductory psychology, or don't remember this particular contribution, you're in luck! I'll be talking more about this study and Bandura's work in April!

Another famous social psychological study is sometimes called the "Milgram shock experiment." Stanley Milgram ran what appeared, on the surface, to be a study of learning, in which a "teacher" shocked a "pupil" for wrong answers. In actuality, the study was about obedience to authority: the teacher (the actual participant) was encouraged by the experimenter to shock the pupil (an actor, who was not receiving real shocks) at increasing intensity. If this sounds like a brutal experiment, it was, although it wasn't deemed unethical until after the fact. Milgram conducted the study to understand why seemingly good people do horrible things at the behest of a leader (such as what happened during the Holocaust). Once again, more on that study in April!

In both of these cases, someone else is influencing the individual directly: the violent media or the experimenter demanding more shocks. But what about topics where the influence isn't as direct or obvious? For instance, what about the ways in which we associate ourselves with others, to form relationships or attachments? These are also prime topics for social psychologists. One example is the attachments we form with people we haven't met. We have favorite celebrities who we feel as if we "know." Or, if you're like me, you've become so attached to a character from a book or TV show, that you feel genuine emotions about what happens to him/her. We call these "parasocial relationships." This is why we, for example, cry when a character we know and love dies. More on that in April!

Hopefully this post has answered some of your questions about my favorite field. If not, feel free to ask questions in the comments below! I absolutely love getting to tell people more about what I do. And be sure to check back in April!

Monday, March 21, 2016

A to Z April Challenge Theme Reveal

I haven't been blogging as much recently, as I've been spending a lot more time in rehearsals than usual. My choir has a benefit every year featuring cabaret-style performances; I'm performing in three groups this year. It's always a great time. Want to join us this year? You can buy tickets here!

And if you're not sure what to expect, you can check out past performances on YouTube! The 2015 playlist is below (and yes, you can find me in a few of the videos, including videos 1 and 3), followed by links to remaining years:

2014 playlist

2013 playlist

2012 playlist

So I'm probably crazy to add something else to my April, right? This year, I'm participating in the Blogging from A to Z Challenge, which occurs every year in April. This means blogging nearly every day. One way to help make writing 26 posts in a month easier is by picking a theme.

Anyone using a theme was encouraged to reveal their theme today:

And here it is:

The A-Z of Social Psychology

All April, I'll be blogging about some of my favorite topics in social psychology, in alphabetical order. Stay tuned!

Sunday, March 20, 2016

Bulldogs, Wildcats, and Bears: NCAA Bracket Mascot Face-off

I've joked in years past about creating an NCAA bracket based on which mascot I think would win in a fight. I decided to sit down and put one together for the first round, and because I didn't have time to post until now (when round 1, and for some games round 2, is already over), we can see how this bracket did!

South Region

Jayhawks (1) v. Governors (16) - The first matchup is a mythical bird against a politician. In cases where the mascot is made up, I based my assessment of its abilities on drawings and logos. You may call me biased, but in this case, I decided that the Jayhawk would win.

Actual outcome: Jayhawks win!

Buffaloes (8) v. Huskies (9) - While anyone who knows me knows I'm partial to puppies, I picked buffaloes.

Actual outcome: Nope - huskies win!

Terrapins (5) v. Jackrabbits (12) - So for this matchup, I had to ask an important question: what the heck is a terrapin? After some googling, I discovered it's a turtle.

I hope you laughed as hard as I did when you realized this matchup is between the tortoise and the hare.

We all know how that story turned out. The tortoise and the hare race; the hare, confident in his win, decides to take a nap, but oversleeps and ends up losing. Thus the moral of the story: naps are fine in moderation, but not when you have something important to do.

That's the moral, right?

Anyway, even though the match is a fight instead of a race, I picked the terrapin.

Actual outcome: Terrapins win!

Golden Bears (4) v. Rainbow Warriors (13) - This would likely be a fierce battle between the warrior and the bear, but I think the bear would emerge victorious.

Actual outcome: Nope - Rainbow Warriors win!

Wildcats (6) v. Shockers (11) - Contrary to what I originally thought, the "shocker" is not a bolt of electricity. Rather, it refers to shocking wheat - stacking harvested wheat into bundles. No offense to farmers, but my money would be on the wildcat.

Actual outcome: Nope - Shockers win!

Hurricanes (3) v. Bulls (14) - Note for this and all subsequent matchups - Mother Nature always wins.

Actual outcome: Hurricanes win!

Hawkeyes (7) v. Owls (10) - A "hawkeye" is not a real thing, just part of a real thing, unless you mean this hawkeye:

The logo, however, is a rather pissed off looking hawk, which would probably defeat an owl in a fight.

Actual outcome: Hawkeyes win!

Wildcats (2) v. Bulldogs (15) - These are two of the most common mascots in the tournament. In fact, even my alma mater's mascot was the wildcat. But has anyone ever looked up 'wildcat' before? They're not that impressive:

You might think this guy is snarling, but the caption of the original photo said he was yawning. No, really.

Wildcats are about the size of housecats. While bulldogs aren't a lot bigger, they could easily win in a fight.

Actual outcome: Nope - housewildcats win!

West Region

Ducks (1) v. Crusaders (16) - The crusaders would probably eat the duck for dinner.

Actual outcome: Nope - ducks win!

Hawks (8) v. Bearcats (9) - Contrary to their name, bearcats aren't the perfect combination of bear and big cat. They're more like pandas or raccoons. In which case, the hawk would probably win.

Actual outcome: Hawks win!

Bears (5) v. Bulldogs (12) - Once again, the poor bulldog wouldn't stand a chance.

Actual outcome: Nope - Bulldogs win!

Blue Devils (4) v. Seahawks (13) - Seahawks sound pretty tough, and in just about any other matchup, they'd probably win. But they would probably be outclassed by devils - blue or otherwise.

Actual outcome: Blue Devils win!

Longhorns (6) v. Panthers (11) - Despite their impressive horns...

longhorns are pretty docile. A panther could easily take one out.

Actual outcome: Panthers win!

Aggies (3) v. Phoenix (14) - An "aggie" is actually a shortening of the word "agriculture." So this matchup is between a farmer and a phoenix. Seeing as how the phoenix can't really be killed, it would probably win.

Actual outcome: Nope - Aggies win!

Beavers (7) v. Rams (10) - While beavers could probably do some damage with those long teeth, a ram would probably do more damage.

Actual outcome: Rams win!

Sooners (2) v. Roadrunners (15) - "Sooner" is a nickname for settlers who snuck into Oklahoma Territory to stake claims before the land run officially began. They could probably best a roadrunner, as long as they don't get their products from Acme.

Actual outcome: Sooners win!

East Region

Tar Heels (1) v. Eagles (16) - A "tar heel" is a person from North Carolina. I think an eagle would probably win in a fight, mostly because when I think of "person v. eagle" I think of this:

Actual outcome: Nope - Tar Heels win!

Trojans (8) v. Friars (9) - I'm sure friars could do some damage if they absolutely had to fight, but trojans would probably be better equipped.

Actual outcome: Nope - friars win!

Hoosiers (5) v. Mocs (12) - People from Indiana or shoes. Um, people.

Actual outcome: Hoosiers win!

Wildcats (4) v. Seawolves (13) - Just like seahawks, seawolves would probably be pretty tough, especially against a housecat.

Actual outcome: Nope - Wildcats win!

Fighting Irish (6) v. Golden Hurricanes (11) - Once again, my money is on Mother Nature.

Actual outcome: Nope - Fighting Irish win!

Mountaineers (3) v. Lumberjacks (14) - This would be an amazing battle between two nearly-equally-armed foes, but I think the lumberjacks would ultimately win.

Actual outcome: Lumberjacks win!

Badgers (7) v. Panthers (10) - The badger could probably do some damage (at least, if it's a honey badger) but the panther would probably be better at fighting.

Actual outcome: Nope - badgers win!

Musketeers (2) v. Wildcats (15) - Another wildcat matchup. I would choose musketeers.

Actual outcome: Musketeers win!

Midwest Region

Cavaliers (1) v. Pirates (16) - This was a tough one, though not as tough as another match below. I think the cavaliers might be better armed than the pirates in a one-on-one fight.

Actual outcome: Cavaliers win!

Red Raiders (8) v. Bulldogs (9) - I would never condone armed men going after a bulldog. But if it had to happen, I'd bet on the raiders.

Actual outcome: Nope - bulldogs win!

Boilermakers (5) v. Trojans (12) - Though Trojans would probably win other matches, I'm not sure they could take on a guy with a blowtorch.

Actual outcome: Nope - Trojans win!

Cyclones (4) v. Gaels (13) - Not to belabor the point, but my bet's on Mother Nature.

Actual outcome: Cyclones win!

Pirates (6) v. Bulldogs (11) - Again, armed men going after a bulldog. What's with you people?! But once again, I'd bet on the pirates.

Actual outcome: Nope - bulldogs win! I really need to have more faith in the bulldog.

Utes (3) v. Bulldogs (14) - Okay, lesson learned. I pick bulldogs over people of Utah.

Actual outcome: Nope - Utes win!

Flyers (7) v. Orange (10) - I had no clue what to make of this one. If you base this match entirely on mascot appearance, you have a wing versus an anthropomorphized piece of fruit. I seriously flipped a coin to pick the winner: orange.

Actual outcome: Oranges win!

Spartans (2) v. Raiders (15) - Again, a close match, but I think the spartans would ultimately win the day.

Actual outcome: Nope - Raiders win!

Final accuracy: 15/32 or 46.9%

Thursday, March 17, 2016

Nationalism vs. Patriotism

I've always avoided being political online - in part because I'm a government employee and in part because I prefer political discussions to be more private and between a few individuals - but I'm sure I'm not alone in expressing some strong fears about the messages coming from a particular Presidential candidate. I couldn't say it better than Ezra Klein, so I'll let him do it:

I'm sure I don't have to explain to you, dear reader, that nationalism is a scary and dangerous thing. This term has been used to describe many of the most terrifying leaders in history. But another term is often thrown around during Presidential elections to describe candidates positively: patriotism.

The distinction between these two concepts is not always clear, and some may conflate the terms. But research suggests they are distinct concepts, although there can be a correlation between the two.

Linda Skitka, who studies political ideology (among other topics), defined these concepts as follows:

Patriotism is "love of country and attachment to national values."

Nationalism, on the other hand, is "uncritical acceptance of nation, state, and political authorities combined with a belief in superiority and dominant status of one's nation."

These concepts can be linked to research on intergroup relations. I've blogged about these topics before, but as a brief reminder: people naturally assign themselves and others to groups. The particular groups one assigns themselves and others to depend strongly on the situation and on which characteristics are made salient. And it takes remarkably little for people to begin sorting similar others into an in-group and dissimilar others into an out-group - an effect demonstrated by research using the minimal group paradigm.

We are inclined to evaluate our in-group positively, because it allows us to experience a boost in our own self-esteem through our association with in-group members. We call this in-group favoritism. We may also evaluate our out-group negatively, once again to boost our own self-esteem (through downward social comparison) and rationalize why we prefer to belong to our in-group and not the out-group. We call this out-group derogation.

But it's important to point out that you can have one without the other. That is, it isn't necessary to tear another group down to build ourselves up. And that is where the distinction between patriotism and nationalism becomes very important. It is entirely possible to love one's country without also hating another country. And research supports the notion that people can hold positive attitudes toward the in-group without any influence on attitudes toward the out-group.

There are also other factors that can impact (or moderate) these outcomes. For instance, people who identify highly with the in-group and who perceive high levels of intergroup conflict are more likely to ascribe opposite characteristics to members of the out-group (known as out-group contrast). So messages that highlight conflict between other nations could turn patriotism into nationalism, if those messages resonate with the individual.

I'm certainly not telling people to avoid being patriotic. It's fine to love one's country, and it's also possible to love one's country while also recognizing its flaws. But it's important to recognize where those feelings should end - prior to derogating other countries. I strongly believe that knowledge is power - one of the reasons I write this blog - and it is my hope that acknowledging the differences (and similarities) between these two concepts will help people to recognize when patriotism can turn ugly.

And now, because I've done a lot of talking about politics, I leave you with a puppy.

Monday, March 14, 2016

Bracket time!

It's the most wonderful time of the year...

That's right, it's time for the NCAA Men's Division I Basketball Championship, otherwise known as March Madness.

I've been a basketball fan my entire life, and like my family, I'm a huge fan of the University of Kansas Jayhawks. They've had a great record in the tourney: 45 total appearances, including 28 times in the Sweet Sixteen, 21 times in the Elite Eight, 14 times in the Final Four, and 3 championship wins. Once again, they're the Big 12 Champions this year, and were selected as a #1 seed during Selection Sunday yesterday. This gives them a (theoretically) easier schedule during the tournament - the #1 seed plays the #16 seed in the first game and, if they win that game, the winner of the #8-#9 matchup game. They wouldn't have to play a higher seed than that until the Regional Semifinals (depending of course on who wins the #5-#12 and #4-#13 matchups).

In addition to just being a basketball fan, I love the tournament. The tournament is single elimination, and though higher seeds get matched up with lower seeds in their games, anything can happen. There are always upsets, like in 2013 when a #9 seed (Wichita State) made it to the Final Four and a #15 seed (Florida Gulf Coast) made it to the Sweet Sixteen. Or 2014 when University of Dayton (#11) beat Ohio State (#6), resulting in this hilarious headline poking fun at THE Ohio State:

But I really fell in love with the tournament about 10 years ago when I realized it's all about data.

Mounds and mounds of data. So putting together a good bracket is really about using data to make predictions.

The past few years, I've decided to run tests of different bracket methods, to see which one comes out on top. I usually have one bracket I base entirely on who I want to win each game. Then for the remaining brackets, I'll try everything from running statistical models to predict winners (the most data intensive) down to simply flipping a coin (chance). I'll do the same thing again this year and plan to report my results. Stay tuned!
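To make the comparison concrete, here's a minimal sketch of how two of these bracket strategies could be scored against each other; the matchups, strategy names, and outcomes below are purely hypothetical, not my actual bracket or models:

```python
import random

# Hypothetical first-round matchups as (seed, seed) pairs
matchups = [(1, 16), (8, 9), (5, 12), (4, 13), (6, 11), (3, 14), (7, 10), (2, 15)]

def pick_by_seed(matchup):
    """Seed-based strategy: always take the better (lower-numbered) seed."""
    return min(matchup)

def pick_by_coin_flip(matchup):
    """Chance baseline: flip a coin for each game."""
    return random.choice(matchup)

def score_bracket(picks, outcomes):
    """Accuracy = correct picks / total games."""
    correct = sum(p == o for p, o in zip(picks, outcomes))
    return correct / len(picks)

# Pretend outcomes with two upsets (the 12 and 11 seeds win)
outcomes = [1, 8, 12, 4, 11, 3, 7, 2]
seed_picks = [pick_by_seed(m) for m in matchups]
print(score_bracket(seed_picks, outcomes))  # 0.75
```

A real test of the statistical-model approach would replace `pick_by_seed` with predictions from a fitted model, but the scoring step stays the same.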

Tuesday, March 8, 2016

Writing Good Survey Questions (or Why I'm Not the Person You Want Responding to Your Surveys)

As I've said before, we all see the world through the lens of our discipline. And the things that irk us tend to be silly and overly specific to the outside observer.

This is why, when I receive a link to a survey, I usually don't respond. Of course, this results in others pointing out that I should be helping my fellow researchers out when I qualify for a survey. But sometimes, it's very difficult to ignore issues with those surveys: poorly worded questions, bad response options, and sometimes nonsensical instructions make it hard for me to respond without being distracted. And if the survey was especially bad and has a spot for open comments, I let them know about some of the mistakes there.

Yes, it's totally obnoxious and I can't help it. So I find it better to just ignore the survey.

I recently received a "survey question" - which was clearly meant to sell something - at which I rolled my eyes and immediately decided to write a blog post on writing good survey questions. The question was: "Do you want to be your healthiest self?" Like I said, they were selling something, and rules for good survey questions don't really apply in those situations, because the purpose is not to gather information. But I've seen some similarly worded questions on actual surveys.

So here are some of the mistakes I frequently see in survey questions:
  1. Loaded questions - Loaded questions encourage a person to respond in a certain, usually socially desirable, way. More specifically, they reveal the bias of the researcher. The question above is one such example; here's another: Most Americans prefer to purchase products manufactured in the United States. Do you prefer to purchase products manufactured in the United States?

  2. Double-barreled questions - These are questions that ask about two things at once. One of my favorite examples I used when I was teaching research methods came from our course evaluations: The instructor explained concepts in a clear and concise manner. It's definitely possible to be clear but not concise, or concise but not clear. So this question really only works if the instructor is both clear and concise, or neither clear nor concise. If both of these concepts are important to you, you need to have two separate questions.

  3. Ambiguous response options - Let's say I'm surveying researchers and want to know what they use to enter data, and I include response options of: Excel, Access, a spreadsheet program, a database program. If someone uses Excel, they could also select a spreadsheet program. I probably intend that option to cover spreadsheet programs besides Excel, in which case I should label it other spreadsheet program instead. You should, of course, list the other major options before offering the other option. I would also recommend including a blank for people to specify the other program, because you'll get people who select other and write in one of the answer options; this way, you can recategorize them later.

  4. Unspecific timelines - For example: Have you felt sad or depressed? Yes, we all have, even the most mentally healthy of us. This question won't tell you anything about who has a current issue with sadness or depression. Worse yet is a question that asks the number of times you have done something, like How many times have you used Facebook? This would be impossible to recall. But add one simple instruction: In the past 7 days, have you felt sad or depressed? or In the past 7 days, how many times have you used Facebook? Now people can recall their feelings or behaviors from a more specific time, and you're more likely to get useful information.

  5. Asks about sensitive topics without guaranteeing anonymity - If you're going to ask about topics like drug use, sexual activity, etc., you need to make sure respondents feel secure that their answers won't be compromised. Otherwise, they may not respond truthfully, and then what's the point of doing a survey? True, some people aren't shy, and you'll get accurate information from them. But a) you won't know who is being truthful and who is not and b) your data won't represent your whole population.

  6. Too much or unclear branching - Sometimes you want additional information from people who respond in a certain way, while everyone else can skip to the next question. But if you have to do lots of branching, you risk confusing respondents, who may just throw away the survey in disgust. If you find you have a lot of branches, ask yourself if they're necessary, and if so, whether it's because you're really only interested in a certain subgroup. You might want to consider reformatting the survey - putting all the non-branching questions first, then asking your specific group to fill out the rest - or even focusing on that particular subgroup when sampling. If branching is minimal, find ways to format the survey to make it really clear, like putting a box around the branched questions and/or adding an arrow to show where to go next. (Note: This is really only relevant to paper surveys. Phone or internet surveys are perfect for branching, so you might explore those options instead.)

  7. Negative wording - It's never not a bad idea.

  8. Trying to cram too many questions on one page - We do this to make a survey look shorter and less burdensome to respondents, and it's a bad idea on all counts. First, if you need to make your survey look shorter, it's probably too long. Second, people are more likely to skip questions that are crammed together, because they didn't see them. In short:

  9. Asymmetrical response options - I've definitely seen response options along the lines of Strongly Agree, Agree, Disagree. If you have positively worded options, you should try to have an equal number of negatively worded options, such as: Strongly Agree, Agree, Disagree, Strongly Disagree or Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree.

  10. No metric for numerical answers - If you ask a question requiring an open-ended numerical answer, you need to let them know what metric they should be using. A question like How far do you live from your primary care doctor's office? followed by a blank could be answered in many ways. You may think people will give you miles, but if they only live 5 blocks away and you don't tell them to think in miles, you'll misinterpret their answer. Same goes for hours and minutes if you're asking about travel time.

  11. Making respondents rank too many things - I've taken surveys where I'm asked to rank a list in order of importance. If that list has more than 5 items, don't make people rank all of them - just ask them to rank the top 5 (and actually, top 3 is best). When an online survey once made me rank 10 items (and wouldn't let me continue until I gave every item a rank), I closed the survey.

  12. Relatedly, too many response options - Psychologist George Miller found that people can only keep about 7 (between 5 and 9) things in their head at once. If people can pick more than one (a check all that apply question), you should still limit the number of options, and if there are many potential options, try splitting them across more than one question. But you shouldn't have more than 7 forced-choice options; otherwise, people might forget the first option before they reach the end of the list.

  13. Open-ended questions when you should use a close-ended question - On one large survey I worked on, the surveyor included an open-ended question for something that could have been summed up with 4 or 5 options plus an other, specify option. As I feared, the open-ended data were a complete disaster and totally unusable. Every once in a while, I think I should go through and try to categorize, because it's an important question, but then I see the breadth of responses and curl up in a ball.

Now, in some of the above examples, I gave neutral options. This is a hotly debated topic in survey research. Some think neutral options are a bad idea and prefer to force people to "pick a side." On the other hand, others think neutral options are necessary, because people may not have an opinion on an issue and may skip the question entirely if they don't have an option they agree with. I'm generally a fan of neutral options, but there's really not a right or wrong answer here.
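Since internet surveys handle the branching described in point 6 automatically, here's a minimal sketch of how that logic might look under the hood; the questions, IDs, and branch rules are hypothetical:

```python
# Each question names the next question to show, optionally depending on the answer.
SURVEY = {
    "q1": {"text": "Do you smoke?",
           "next": lambda a: "q2" if a == "yes" else "q3"},
    "q2": {"text": "How many cigarettes per day?",
           "next": lambda a: "q3"},
    "q3": {"text": "In the past 7 days, have you felt sad or depressed?",
           "next": lambda a: None},  # None = end of survey
}

def run_survey(answers, start="q1"):
    """Walk the branch structure, returning the questions actually shown."""
    shown, q = [], start
    while q is not None:
        shown.append(q)
        q = SURVEY[q]["next"](answers[q])
    return shown

print(run_survey({"q1": "no", "q3": "no"}))  # ['q1', 'q3'] - q2 skipped
```

The respondent never sees the skip logic, which is exactly why electronic delivery sidesteps the confusion that heavy branching causes on paper.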

The best thing you can do when creating a survey is cognitive testing. What this means is giving your draft survey to a small number of people one-on-one, and asking them to "think out loud" as they read the questions. You may also ask them to paraphrase questions, to get better wording and make sure the question makes sense. This gives you a glimpse of how people approach your survey and if they're struggling with any of the issues identified above or other issues entirely, like unfamiliar terms (jargon) or important topics not assessed by the current questions (which we call "content omission").

As with most topics on this blog, I'm oversimplifying, and there are more nuances to survey design than I've discussed here. In addition to what I've recommended above, you really should have a survey methodologist on board, especially for very large surveys. Most of these considerations aren't intuitive (and that's okay - you don't have to be an expert in everything!), and designing a good survey can be an iterative process, with cognitive testing and even pilot-testing needed. Find an expert to do that dirty work, and wait for the data to come rolling in!

Sunday, March 6, 2016

Association for Psychological Science Convention 2016

It's been years since I've attended the Association for Psychological Science (APS) annual meeting. My research interests have shifted since I started working for VA, and while many of the topics I study now are still relevant to psychological science, we often focus our limited travel dollars on conferences more directly relevant to health services research. I traveled to APS a couple of times on my own dime, but haven't done that for the past 3 or 4 years.

That being said, APS was always one of my favorite meetings to attend, and I love this organization. I get to network with fellow research psychologists. APS even gave me my first ever research award in 2010, which was just the validation I needed to decide that I wanted a career in research; previously, I worried that I wasn't that good at research, and thought of myself as a better teacher. So I'm super-excited that APS is in Chicago this year! Even though I didn't submit anything for presentation, I still plan to be there.

Friday, I received an email about the various workshops being offered at APS on Thursday, May 26 and Sunday, May 29. There are so many great ones, it's going to be difficult to narrow down. Here are just a few I would love to attend:
  • The Theory and Practice of Machine Learning in Psychology: The What, Why, and How of a Powerful Statistical Technique (Thurs., 9:00am - 12:20pm)
  • Structural Equation Modeling With Lavaan (Thurs., 12:00pm - 3:50pm) - I recently started using this package, which I'm thrilled to report is pretty user-friendly, but I'd love to learn how to use it to its fullest
  • Improving Reproducibility of Our Research Practices (Thurs., 1:00pm - 3:50pm) - including getting the most out of the Open Science Framework, which I blogged about the other day
  • Using Psychological Science to Write With Clarity and Style (Thurs., 4:00pm - 5:20pm)
  • Uses and Challenges of Mechanical Turk (Sun., 8:30am - 11:20am) - I'm seeing more and more research being conducted using Amazon's Mechanical Turk, which is a great place to do online research with a more diverse population
  • Elegant Multilevel Modeling (Sun., 8:30am - 12:20pm)
  • Methodological Approaches to Designing Adaptive Interventions in Mobile Health (Sun., 8:30am - 12:50pm)
  • BayesFactor and JASP: A Fresh Way to Do Statistics (Sun., 10:00am - 12:50pm) - Bayesian statistics: the way of the future
Obviously, I can't go to all of them, and some listed above run concurrently, so I'd have to choose: SEM with Lavaan or Open Science Framework? Mechanical Turk, Multilevel Modeling, Mobile Health, or Bayesian? Anyone have a cloning machine so I can attend every workshop?

Friday, March 4, 2016

On Open Data and its Benefits and Drawbacks

Open Access Button, an app designed to help people access scholarly research (and report when they're denied access), will be releasing a beta version of a browser add-on, the Open Data Button, tomorrow (March 5, as part of Open Data Day). The beta currently works with Chrome, and a Firefox add-on will be available soon. This handy button is designed to help people access the data behind scholarly research: if the data are available in the Open Science Framework, it will give you a link to them, and if they aren't, it will start a request to the author to make the data available.

This browser add-on is the latest development in the call for increased transparency in scholarly research. And secondary data analysis can make important contributions to the literature. Just as researchers shouldn't "reinvent the wheel" with data collection instruments - instead drawing on past work - it's a much better use of resources to reanalyze previously collected data that can answer your question than to go through the resource-intensive process of primary research. Secondary data can also be used in education, such as in statistics courses. When I took structural equation modeling in grad school, our final project was supposed to use secondary data (which could have been our own data if we had something that would work), so that we could jump right into practicing analyses with data, instead of spending time collecting the data first.

Despite all the potential benefits of open data, researchers are still hesitant to make their data freely available. When I was teaching research methods, I contacted researchers of recent articles I had assigned as course reading - my students had asked for more review and experience with statistics, and I thought analyzing the data in real-time while they had the article with the aggregated results in front of them would help solidify these concepts. I was very clear in my emails that I was requesting the data purely for a classroom exercise, and had no intention to conduct additional analyses or publish anything I found in the data. I heard back from two researchers, who both said no and were quite defensive about it. One even claimed to no longer have the data, despite the fact that the study had been published that year (to be fair, there can be a large gap between completing the research and getting the work published). The rest never responded.

Why might these researchers have been so against the idea? The jaded response is the fear that we might not confirm the published results, which could mean anything from simple accidents (typos in data tables) to purposeful fraud. Of course, there are other perfectly understandable explanations:
  • The researchers have additional papers planned with the data that are in (or soon to be in) progress - There's nothing worse than getting scooped: coming up with a great idea, only to discover someone else got there first. Imagine how much worse it would be if they scooped you with your own data that you painstakingly collected.
  • Releasing the data could risk participant confidentiality - Obviously, freely available data should never contain identifiers, like name or address. In fact, for most of our research at work, anything considered identifiable information (based on HIPAA) is stored in a separate crosswalk file, which is linked to the data by a randomly assigned ID number. However, research has shown that even some basic demographic data can make participants identifiable. For instance, a study using 2000 census data found that 63% of the US population could be identified with gender, zip code, and birth date. Redacting that information down to year (or year and month) of birth, and/or county instead of zip code, drastically decreases the proportion of identifiable participants. Studies done on a much smaller scale than the census, and in more specific populations or organizations, could also introduce risks to participant confidentiality. While researchers can, and should, clean their datasets before making them public to remove any sensitive variables, it may not be immediately clear how these variables, or combinations of them, make a sample potentially identifiable.
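To make the quasi-identifier problem concrete, here is a minimal sketch of the kind of uniqueness check a researcher could run before releasing a dataset. The records below are entirely invented for illustration; the point is only that combinations of innocuous-looking columns can single people out, and that coarsening a column reduces that risk.

```python
from collections import Counter

# Hypothetical, invented records: (gender, zip code, birth date)
records = [
    ("F", "60614", "1985-03-12"),
    ("F", "60614", "1990-07-01"),
    ("M", "60614", "1985-03-12"),
    ("M", "60614", "1978-11-23"),
    ("F", "60615", "1985-03-12"),
]

def unique_fraction(rows, cols):
    """Fraction of rows whose combination of the given columns is unique
    (i.e., potentially re-identifiable from those columns alone)."""
    counts = Counter(tuple(row[i] for i in cols) for row in rows)
    unique = sum(1 for row in rows if counts[tuple(row[i] for i in cols)] == 1)
    return unique / len(rows)

# Full gender + zip + birth date: every row is unique in this toy sample
print(unique_fraction(records, (0, 1, 2)))
# Dropping birth date (gender + zip only): far fewer rows are unique
print(unique_fraction(records, (0, 1)))
```

In this toy sample, all five rows are unique on the full triple, but only one in five is unique once birth date is removed - the same logic behind redacting birth dates and zip codes before release.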
  • Making the data freely available eliminates the possibility of tracking usage/data use agreements - Even a researcher who is willing to share his/her data might want to know who is using it and how. Data use agreements could prevent the potential for scooping mentioned above, by stipulating how the data can and cannot be used. And just as researchers are excited to see who is citing their work, they may also be excited to see who is using their data. Of course, this button would allow tracking to some extent, though only once the research is published.

Overall, I do want to be clear that I think open data is a good idea, though there are important situations where it simply isn't a possibility. I wonder if/how this could be used with qualitative data. Obviously, one possibility is simply to have redacted transcripts/fieldnotes available for download as a PDF or Word document. However, unlike quantitative research, where variables are clearly labeled as representing important constructs, qualitative data is by its nature unstructured, making it less obvious how a codebook was applied. And single lines of text can represent multiple constructs, depending on the particular coding approach a researcher used (that is, some approaches allow for simultaneous coding of the same text, while others do not). So alternatively, it might make sense to include reports from qualitative analysis software, where researchers can pull out all quotes/text they've classified within a particular code.

This is all just one researcher's opinion, and others will likely have very good reasons for a different opinion. What do you think about the open data movement?

Tuesday, March 1, 2016

The Province of Men: Gender Bias Research on the First Day of Women's History Month

March is Women's History Month (read more about it here). Perhaps for this reason, one of my psychology list-servs sent out several links to recent research on gender disparities.

Two used freely available data to examine differences in perceptions about the quality of women's work and value of products sold by women.

The first study (summary and full preprint) examined open source code on GitHub, and found that code written by women was more likely to be approved by users than code written by men, but only when the gender of the coder was hidden. Specifically, the approval rate was 78.6% for "blind" code written by women (compared to 74.6% for code written by men) but was only 62.5% for code written by women when gender was specified on the user's profile.

The second study found that products sold on eBay by women received fewer bids and lower final prices than the same products sold by men. Unlike GitHub, gender is not available on a seller's profile, but a supplemental study by the researchers found that people could identify gender of the seller based on copy in the product posting.

Another article examined the way women are perceived when they seek out positions of power or promotions/raises. The author begins with Hillary Clinton as an example case (though, as they are quick to point out, a very unique one). When Clinton stepped down as Secretary of State, her approval rating was 69%. Contrast that with current sentiment about Clinton, including mistrust and the perception that she is loud and angry. The author argues that approval of Clinton drops when she seeks a new position and "soars" after we've seen her do the job. It's the difference between an assertive man being called a "boss" and an assertive woman being called "bossy" - one is a compliment, the other a harsh insult.
As I've blogged about before, personal characteristics like gender can be used to interpret behavior and evaluate an individual, particularly when you don't have other important information to make that evaluation (such as expertise necessary to determine what "good" looks like, or personal knowledge of the individual that outweighs these details). And any evaluation using these characteristics will draw upon stereotypes about how a member of that group is "supposed" to act. Correcting for these stereotypes is liable to be difficult, if they are firmly ingrained, and suppressing this information can result in a stronger reliance on that information (see a previous blog post about thought suppression).

So what do we do with this information? The first study, on GitHub submissions, would suggest that one correction for gender bias is simply to remove gender from the equation, by blinding evaluators to this information. Of course, this 1) isn't feasible in many cases, such as in politics, and 2) does nothing to change stereotypes about gender and expertise. We need strong women. We need women who excel in stereotypically male fields. And we need to change the way we evaluate and interpret women's behavior. Unfortunately, there's no quick way to do that, and like many cultural changes there will be some growing pains.

The only way is to keep pressing forward and pushing boundaries whenever possible. There is a compliance technique in social psychology and business that springs to mind: the foot-in-the-door technique, which involves getting a person to agree to a large request through smaller, incremental requests. As the individual goes along with the small requests, which grow larger over time, they may experience a change in attitude through processes like cognitive dissonance. Each small push toward large-scale social change can bring about attitude shifts. One could argue that any policy change (women's suffrage, civil rights, changes to marriage laws) is obtained through small requests that grow over time. But for a change this large, no single person can carry the burden.

The big question, of course, is what will happen in the 2016 Presidential Election. I certainly would never tell anyone who to vote for, and would not suggest one vote for Hillary because she is a woman. But neither should people continue to negatively evaluate a candidate simply because she is a woman.