A Famous Study Found That Blind Auditions Reduced Sexism in the Orchestra. Or Did It?

More implicit bias research comes under scrutiny

Robby Soave | 10.22.2019 10:00 AM

(Yuri Gurevich | Dreamstime.com)

One of the best-known scientific studies to posit that implicit bias—the idea that all people are unconsciously racist, sexist, etc.—can be counteracted via strategic effort is taking a well-deserved beating. It now appears that the findings were significantly overstated.

The study, "Orchestrating Impartiality: The Impact of 'Blind' Auditions on Female Musicians," by Harvard University's Claudia Goldin and Princeton University's Cecilia Rouse, was released in 2000. Its bombshell finding was that blind orchestra auditions—which prevented the choosers from seeing whether each auditioner was male or female—increased female auditioners' odds by 50 percent. The American Enterprise Institute's Christina Hoff Sommers notes that the study was "lionized by Malcolm Gladwell, extolled by Harvard thought leaders, and even cited in a dissent by Justice Ruth Bader Ginsburg." Jesse Singal, a contributing writer at New York magazine who has often criticized bad social science (and is writing a book about misleading statistics that have gone viral), has actually cited the orchestra study as one of the more important entries in the field of implicit bias. In a review of Jennifer Eberhardt's book, Bias, Singal wrote:

Eberhardt also presents some difficult-to-refute findings about the role of implicit bias in the real world, including famous studies in which identical résumés are sent out with white- and black-sounding names (with the white ones getting far more callbacks) and another famous experiment in which a screen shielding performers' identities during auditions led to more gender parity in previously male-dominated orchestra hiring. The results of these tight, elegant experiments suggest that implicit bias is at least part of the equation. And in some cases they offer clear partial solutions, such as shielding certain identifying information about job applicants and focusing more on tests of ability to perform the tasks required by a position.

My point is that in an area of research fraught with replicability problems, the orchestra study was supposed to be one of the good ones.

Well, so much for that. In May, Columbia University statistician Andrew Gelman took a deep dive into the study. He described its results as "not very impressive at all" and had great difficulty locating the 50 percent statistic within the modest findings.

"You shouldn't be running around making a big deal about point estimates when the standard errors are so large," he wrote. "I don't hold it against the authors—this was 2000, after all, the stone age in our understanding of statistical errors. But from a modern perspective we can see the problem."

Sommers wrote about Gelman's critique in a recent piece for The Wall Street Journal, which will probably bring it more attention. She noted the existence of another study that had contradicted Goldin and Rouse:

In 2017 a team of behavioral economists in the Australian government published the results of a large, randomized controlled study entitled "Going Blind to See More Clearly." It was directly inspired by the blind-audition study. Iris Bohnet, a Harvard Kennedy School dean and Goldin-Rouse enthusiast, served as an adviser.

For the study, more than 2,000 managers in the Australian Public Service were asked to select recruits from randomly assigned résumés—some disguising the applicant's sex, others not. The research team fully expected to find far more female candidates shortlisted when sex was disguised. But, as the stunned team leader told the local media: "We found the opposite, that de-identifying candidates reduced the likelihood of women being selected for the shortlist." It turned out that many senior managers, aware that sexist assumptions had once kept women out of upper-level positions, already practiced a mild form of affirmative action. Anonymized hiring was not only time-consuming and costly, it proved to be an obstacle to women's equality. The team plans to look elsewhere for solutions.

Blind interviews and auditions may be preferable for other reasons. They may even reduce implicit bias in some situations. But as is so often the case, the sweeping claims of social scientists do not seem to survive scrutiny.

It will be interesting to see if diversity coordinators, many of whom incorporated this shoddy scholarship into their training seminars, adjust course, though I wouldn't count on it. The entire concept of microaggressions lacks scientific legitimacy, after all, but this hasn't stopped college diversity czars from policing them.

Sommers has more on the orchestra study in video format.