
Their Science Is Not Broken, Assert Psychologists

Contradicts claims that only 40 percent of studies could be replicated.

Illustration: Jason Keisling

Researchers at the Open Science Collaboration (OSC) trying to replicate the results of 100 prominent psychology studies found that only about 40 percent of them could be. These disheartening results were published in Science and widely reported in the popular press. This finding was in line with the growing number of instances in which researchers were reporting that they could not reproduce the published work of their colleagues. In fact, as early as 2005, Stanford University statistician John Ioannidis had asserted that most published research findings are false.

Now some leading psychologists are charging that the results published by the Open Science Collaboration are, in fact, false. In a rebuttal commentary published today in Science, they argue that mistakes in the various replication attempts led the Collaboration researchers to dramatically underestimate the actual reliability of the studies being examined. Among other things, they claim that

many of OSC's replication studies drew their samples from different populations than the original studies did. An original study that measured Americans' attitudes toward African-Americans was replicated with Italians, who do not share the same stereotypes; an original study that asked college students to imagine being called on by a professor was replicated with participants who had never been to college; and an original study that asked students who commute to school to choose between apartments that were short and long drives from campus was replicated with students who do not commute to school.

These are pretty clearly serious flaws, if true. The researchers also question the overall study's statistical analysis of the replication data:

OSC used a benchmark that did not take into account the multiple sources of error in their data, used a relatively low-powered design that demonstrably underestimates the true rate of replication, and permitted considerable infidelities that almost certainly biased their replication studies toward failure. As a result, OSC seriously underestimated the reproducibility of psychological science.
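The rebuttal's claim about statistical power can be illustrated with a toy simulation (all the numbers below are assumptions chosen for illustration, not values from the OSC data): even when an effect is perfectly real, an underpowered replication will frequently fail to clear p < .05, so a raw "did it replicate?" tally understates reliability.

```python
# Toy simulation: low-powered replications of a REAL effect still
# "fail" about half the time. Pure standard library; numbers assumed.
import math
import random

random.seed(0)
true_effect = 0.4   # assumed true standardized effect (Cohen's d)
n = 50              # assumed per-group sample size of each replication
trials = 5000

successes = 0
for _ in range(trials):
    control = [random.gauss(0.0, 1.0) for _ in range(n)]
    treated = [random.gauss(true_effect, 1.0) for _ in range(n)]
    mc = sum(control) / n
    mt = sum(treated) / n
    # Pooled variance and two-sample t statistic:
    ss = sum((x - mc) ** 2 for x in control) + sum((x - mt) ** 2 for x in treated)
    var = ss / (2 * n - 2)
    t = (mt - mc) / math.sqrt(var * 2 / n)
    # df = 98 is large enough to approximate t with a normal:
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))
    if p < 0.05:
        successes += 1

# With d = 0.4 and n = 50 per group, power is only ~50 percent, so
# roughly half of these replications of a true effect come up "failed."
print(f"replication rate: {successes / trials:.2f}")
```

The point is not that the OSC's 40 percent figure is right or wrong, only that the replication rate and the truth rate are different quantities unless power is accounted for.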

According to the New York Times, University of Virginia psychologist Brian Nosek, who headed up the OSC replication study, countered

that the critique was highly biased: "They are making assumptions based on selectively interpreting data and ignoring data that's antagonistic to their point of view."

Only time will tell if the rebuttal is the grousing of some embarrassed old-timers or if psychological research is actually reliable. Stay tuned.

In the meantime, the Open Science Collaboration is working on a replicability study of cancer biology research. Perhaps the results derived from cells and lab rats will be more tractable than those from psychological assays.

For more background, read my feature article dealing with the replicability problem, "Broken Science."



  1. Let’s assume for a moment that the critique is true. Fine.

    That probably suggests that the results of the studies are a lot less generalizable than the public was led to believe (if not by the psychologists, then by the media).

    1. the results of the studies are a lot less generalizable

      That’s just a rule of thumb. I always take claims of external validity with one extra grain of salt as no sampling model involving human beings is perfect, even if it’s “stochastic”.

  2. Shocker. This is my shocked face.

    Only time will tell if the rebuttal is the grousing of some embarrassed old-timers or if psychological research has a real reproducibility problem.

    I’d bet actual money that there is a terrible reproducibility “problem”. More like endemic failure. Far too many studies are grossly politicized–and incredibly obviously as well–and that absolutely devastates any actual “science” they are attempting to perform. This is a problem that has obviously been growing steadily worse for, well, decades now.

    Politicizing shit makes people stupid; it’s just another facet of partisanship, which makes people retarded. And now the retards have come home to roost. Do retards roost?

    1. If “roost” is a slang term for porking your mom, then yes, they do.

      1. Where does the want for cake fit in?

        1. The cake is for after.

      2. Yo, let me tell y’all a story ’bout a girl I knew, / A broke-ass bitch with a gay-ass crew. / She said that I was cute. She said that I was funny. / But the honey couldn’t stop lookin’ at my money. / Busted old lady with a flat tiny ass. / Her body like a skeleton in science class. / Face beat up by the school of hard knocks. / Hair so fried it bleached my Clorox. / It’s like she’s skinny, fat in all the wrong places. / Mothers gotta cover their babies’ faces. / When she walks by people think she’s Godzilla. / Straight outta Compton? Nah, y’all, straight outta “Thriller.” / Lookin’ like a zombie, walkin’ like a chicken. / Mouth full of shit that’s why her breath be stinkin’. / Just one question, Dee, before you take your bow. / This gravy train’s leaving, so who’s retarded now?

    2. In fairness, even real science can be hard to replicate properly. I think the rush to publish and the pressure to prove out your grant have encouraged sloppy procedure.

      1. True, but if you take the examples of sloppiness used, it’s more than that. It’s basically that either they don’t know how to do double-blind studies properly (and *why* you do them), or…they just don’t care.

        I’m betting it’s the latter the majority of the time.

        1. It’s basically that either they don’t know how to do double-blind studies properly (and *why* you do them)

          To be fair, double-blind studies aren’t always feasible or appropriate for psych experiments; that doesn’t mean you can’t glean meaningful data as long as you take care to investigate the phenomenon in a systematic and empirical way. That having been said, for quasi-experiments and correlational studies, researchers should display more epistemological humility (for lack of a better term) when reporting findings and clearly detail the limitations of the study. The fact that their experiments don’t resemble experiments in the natural sciences shouldn’t bother psychologists unless they have ‘Physics Envy’. Of course, this is all before taking into account the politics of promotion committees, funding, etc. But again, that’s not the Academy’s problem; it’s up to the discipline to advocate for its importance and to protect its perception of methodological rigor.

        2. In the case of psych studies, one issue is that there seems to be fairly little interest in using randomized populations. Partly this may be because it’s difficult, partly laziness, but a big component – especially for ‘newsworthy’, politically relevant ‘research’ – is that self-selecting sample populations increase the odds of getting the positive result the researchers want to find.

          In medicine, it is worth noting that clinical trials are often difficult to replicate, due to small and non-random populations; however, that’s usually not due to lack of effort. It’s a lot easier to find a hundred people to fill out a survey than to find two hundred people (anticipating that as many as a hundred will drop out or die) with a particular disease and meeting all the other criteria. And the regulations don’t make it any easier.

    3. In fairness, even real science can be hard to replicate properly. I think the rush to publish and the pressure to prove out your grant have encouraged sloppy procedure.

      1. However, squirrels never seem to have a problem with replication.

    4. Yeah, I see a lot of studies where they actually set out to ‘prove’ that conservatives lack empathy or are way dumber than liberals. Anyone who wants to study such a topic is obviously going to be predisposed to tilting the experiment or will engage in confirmation bias.

      There’s actually an evolutionary psychologist named Satoshi Kanazawa who’s written several hilariously flawed articles on just this topic.

      “Defined as such, liberalism is evolutionarily novel. Humans (like other species) are evolutionarily designed to be altruistic toward their genetic kin, their friends and allies, and members of their deme (a group of intermarrying individuals) or ethnic group. They are not designed to be altruistic toward an indefinite number of complete strangers whom they are not likely ever to meet or interact with. This is largely because our ancestors lived in a small band of 50-150 genetically related individuals, and large cities and nations with thousands and millions of people are themselves evolutionarily novel.”

      This fucking idiot thinks a) political beliefs are purely the product of evolution and b) that something which is ‘evolutionarily novel’ is definitely better than that which is not. I don’t think he knows how evolution works.

      1. Weird. I didn’t know giving away other people’s money constitutes altruism.

        1. He also thinks expanded altruism is somehow linked to greater intelligence when there’s literally no reason to believe this is true.

          You know the nicest people I’ve ever met? People with Down Syndrome. Retarded people tend to be very nice and very giving.

          Would he therefore argue that Down Syndrome makes people smarter because they’re less selfish?

          1. I think it’s parenting. Nice people are more likely to keep a mentally challenged child. Of course he probably wouldn’t like that interpretation either.

          2. Me mum wrote a short story for a college class involving aliens subtly colonizing earth by inserting extra chromosomes into certain fetuses. That’s right: Downs kids are alien hybrids.

          3. He also thinks expanded altruism is somehow linked to greater intelligence when there’s literally no reason to believe this is true.

            I’ve wanted to get this or similar forged on a hammer so I could use it on anyone who’s ever used the Prisoner’s Dilemma as an argument for policy;

            Nevermind that you just have to magically have faith in one or all of the other people and that individual empiricism/rationality/risk aversion dictates the exact opposite decision, it’s okay to forcibly rape someone and, as long as we all agree on our stories or the best course of action, the best outcome for all of us can be achieved.

            1. The “this’ being your quote.

            2. What about the ‘tit for tat’ strategy for the iterated Prisoner’s Dilemma?

              1. The fallacy doesn’t lie with the solution to the dilemma, it’s the implied or underlying ‘is/ought fallacy’ that is near universally disregarded.
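For what it's worth, the "tit for tat" strategy mentioned above is easy to sketch in code. This is a minimal illustration, not anyone's published implementation: the payoff numbers are the classic Axelrod-tournament values, and the function names are mine.

```python
# Iterated Prisoner's Dilemma sketch with the standard Axelrod payoffs.
# Tit-for-tat cooperates on the first move, then mirrors the opponent's
# previous move.
PAYOFF = {  # (my move, their move) -> my score; C = cooperate, D = defect
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(opponent_history):
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b = [], []  # each strategy sees the OPPONENT's past moves
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strategy_a(hist_b), strategy_b(hist_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

# Tit-for-tat loses only the first round to a pure defector, then
# retaliates, so it can't be exploited for long.
print(play(tit_for_tat, always_defect))  # -> (9, 14)
print(play(tit_for_tat, tit_for_tat))    # -> (30, 30)
```

Which is the is/ought point exactly: the iterated game shows what *can* sustain cooperation, not that anyone is obligated to play it.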

          4. That’s funny. I’ve always figured that, from a strictly evolutionary standpoint, sociopaths are the smartest people. They waste the least energy on activities not conducive to their own survival and procreation (on things like helping others) and are least likely to let societal norms and morals inhibit their actions toward survival and procreation.

            Evolutionarily, from an individual’s perspective, altruism is quite stupid.

      2. If I were a purveyor of stale, undank memes, I would point out the fact that you didn’t choose as your exemplar of Kanazawa’s terribleness the study where he claimed empirical evidence suggests that Black women are the most unattractive group. Of course, that could be due to the fact that in the same study Kanazawa claimed that Black men were the most attractive.

        1. What is ‘latent physical attractiveness?’

          1. I assume he means that the construct “physical attractiveness” in his study was measured as a latent (as opposed to observed) variable, like, say, through factor analysis.

        2. Making “Asian” a single category is pretty funny. To my tastes, at least, “Asian” includes both the most and least attractive racial/ethnic groups.

    5. And even if the psychology studies are reproducible, most of them probably still don’t tell us much of anything that is useful or applicable. At least not in the ways the designers of the studies claim.

      1. most of them probably still don’t tell us much of anything that is useful or applicable

        Would you have the same criticism of a paper on Graham’s Number?

        1. A lot of mathematics isn’t terribly useful or applicable either. But it has the advantage of being actually proven. Math isn’t science, so reproducibility isn’t an issue. But it’s also true that most really new mathematics these days is pretty unlikely to have much practical application.

          I’m not saying that people should just give up on trying to do social science studies. I think there are lots of things to be learned there. But I also think that an awful lot of what does get done mostly just demonstrates how people behave in contrived social science studies. That doesn’t mean there isn’t a lot of interesting and worthwhile work done in anthropology, sociology, etc.

          1. A lot of mathematics isn’t terribly useful or applicable either. But it has the advantage of being actually proven.

            Of course, Gödel argued that there will always be unprovable statements within a particular formal system.

            I’m not saying that people should just give up on trying to do social science studies.

            Fair enough. My question was more about “pure vs. applied” than the perceived rigor of any particular discipline. In my experience, when people criticize pure research, they tend to point to a field like psychology as opposed to mathematics or cosmology.

              1. I’m pretty familiar with Goedel’s incompleteness theorems. Most things that people want to prove can still be proven in those systems. Of course, you still assume the axioms. But it’s all nice and neat.

              I’m all for pure research. I just happen to think that a lot of social science research is bullshit. “Useful or applicable” maybe wasn’t quite the right standard for what I was trying to say.

              1. It’s also worth noting that Graham’s number or other pure mathematical research won’t get passed around on buzzfeed to everyone in the world and their grandmother as ‘proof’ of some social, psychological, or political ‘fact’ when in truth it is entirely spurious.

                Poorly done social science research, due to its policy implications and the readiness of people to believe it, thinking it has the imprimatur of science, makes it potentially very harmful, in some cases not entirely unlike a poorly designed clinical trial.

  3. The psychologists who think the replications are valid should do their own studies. Everything of interest should get replicated with larger sample sizes. It is one of the reasons we are still sure F=ma.

    1. Well, sure, if you’re into gendered equations and misogynistic science.

    2. F=dp/dt

      Fewer assumptions.

      And closer to what Newton said.
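Spelled out, the comment's point is that Newton stated the second law in terms of momentum, and F = ma only falls out of it as a special case:

```latex
F = \frac{dp}{dt} = \frac{d(mv)}{dt} = m\frac{dv}{dt} + v\frac{dm}{dt}
```

When mass is constant, \(dm/dt = 0\) and this reduces to \(F = m\,dv/dt = ma\); the momentum form is the more general statement.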

  4. Psychology and much of the social sciences don’t even rise to the level of tautologies.

    1. Bullshit, based on my totally scientific (yeah that’s right, I scienced the shit out of it, bitch) survey of incoming college freshpersons, I have concluded that all non-progressive individuals are totally whack-a-doodle.

    2. Psychology and much of the social sciences don’t even rise to the level of tautologies.

      I’d even expand this to epidemiological sciences/studies and assert that any science that’s based solely on statistics for truth is inherently studying the research system as well as the phenomenon under study.

      1. “any science that’s based solely on statistics for truth”
        All science is essentially dependent on statistics; it’s just that in some cases the variance is so small that the probability of a false positive is negligible. And all science is implicitly studying the research system. If you sequence a bunch of people’s genomes and look for point mutations associated with physical features or diseases, you are implicitly studying the effectiveness of the sequencing machine.

        In epidemiology, I imagine most population sizes are fairly small, and human populations tend to be heterogeneous and not uniformly distributed geographically, making things more difficult. So in a lot of cases the meaningful conclusions are to be drawn not from each individual study but from the meta-analysis of all of the studies.

  5. An original study that measured Americans’ attitudes toward African-Americans was replicated with Italians…

    I wanted this sentence to end right then and there so badly.

    1. I bet the conclusion was a whopper

    2. “Hey-a, these moulingon are-a all-a-right, si?”

  6. It was broken ten years ago when I was a psych major and I’m fairly certain it’s a lot worse now. Many (most?) of the studies are pseudoscience and much of it, particularly in social psychology, is ideologically biased.

    1. No, no, no. The people who disagree with them are ideologues. They themselves are reality-based.

  7. “According to the New York Times, University of Virginia psychologist Brian Nosek, who headed up the OSC replication study, countered that the critique was highly biased: “They are making assumptions based on selectively interpreting data and ignoring data that’s antagonistic to their point of view.””

    And I’m sure a person involved in a psychological study would never do that…

    1. Yeah, that statement smacks of projection to me.

      1. Smacks? It’s flat out projection. And all of these studies are fucking rife with projection from the “researchers”.

        1. It may not be projection so much as it’s “researchers who assured their granting agency that this is the outcome they’d get.” When requests for proposals are posted, it is incredibly clear which applications will be successful.

    2. Nosek’s areas of research:

      Stereotyping and identity. Implicit gender stereotypes about math and science predict girls’ and boys’ engagement with science even among people who consciously reject the stereotype.

      Ideology. Ideologies and motives need not be consciously articulated or available to exist and influence behavior (Nosek, Graham, & Hawkins, 2010).

      1. Ideology. Ideologies and motives need not be consciously articulated or available to exist and influence behavior

        If they are not available, then how can he evaluate them when considering behavior differences?

    1. “You broke my favorite mug, prepare to die!”

    2. I really like john wick, it was a good rental.

    3. “Take that, Tom Cruise!”

      I am not a 3-Gunner, but that looks awfully quick to me. Kudos to Reeves for putting the work in.

  8. Yeah, interesting. We used to hold our studies to tighter p levels and stuff precisely because psych was so “soft” and results were so hard to replicate. But I was fortunate as an undergrad to have excellent program that really focused on research methods and “good science”. I’m sure it’s worse now.

    I miss those days. Was involved in normalizing a bunch of tests and some other big-time research that graduates brought back to us to participate in. Big-time shit for little undergrad psych students!

    Plus, the lab rats were fun…

    1. “Stop feeding the lab rats to Florida Man’s pet python!”

    2. Problem is the ethics of doing proper controls.

  9. For something to be broken does that not assume it functioned in the first place?

  10. So does this mean psychohistory can never develop?

  11. Home income source by collider.. I’m making over $5k a month working part time. I kept hearing other people tell me how much money they ZD can make online so I decided to look into it. Well, it was all true and has totally changed my life.
    This is what I do, http://www.payability70.com

    1. That’s all well and good, but are the results reproducible?

      1. Home income source by supercollider?!

  12. I fucking love science

  13. Witchcraft and sorcery have better reproducibility. No, really.

  14. OT: but speaking of scientists, wtf is going on with this Ghostbusters remake? And why are so many people offended by it? And why are there guides on how to talk about it?

    1. I watched the trailer and thought it looked like a bad movie. I’m not clicking on the rest of those links. You can’t fool me.

      1. I’m not clicking on the rest of those links.

        TL;DR, WHY NO BLACK WOMAN SCIENTIST?!?!

        /This is why it will suck.

    2. but speaking of scientists wtf is going on with this Ghostbusters remake? and why are so many people offended by it?

      Ghostbusters was pretty original and, somewhat more importantly, the band broke up at the right time. SNL hasn’t been funny for almost an entire generation, and the last thing we need is Bridesmaids 2: With Ghosts, from some of the most unfunny of them.

      IMO, there’s also a distinct element of the idea that you *have* to accept women like Melissa McCarthy and you have to accept them in roles like this or you’re a misogynist. If you’re tired of the zombie genre but still found Pride and Prejudice and Zombies interesting, it’s only because you’re a bloodthirsty violent sexist.

  15. For the most part, rebuttals to the reproducibility crisis have only shown that the rebutters don’t even know how experimental design is supposed to work. This appears to be no exception.
