Can Psychology Experiments Be Validated? Yes, According To Scientists Who Found High Replication Rates For Studies
Wandering through a psychology department, you may be struck by the painful uselessness of a line of inquiry (are we programmed to laugh when tickled?) or the utter mismatch of subjects for an experiment (college kids as sole participants in a research study of work attitudes). No wonder psychologists get a bad rap. No wonder, too, the gleeful headlines following a 2015 report that showed more than half of 100 psychology studies could not be replicated, rendering them invalid.
However, that well-publicized 2015 report from the Open Science Collaboration got it wrong, say Harvard researchers, whose own examination of the data suggests the OSC scientists, a loose network of researchers, professionals, citizen scientists, and others, made some serious mistakes. In the end, the replication rate in psychology is quite high, says the Harvard team. In fact, it is statistically near perfect.
Weeding the Garden
When a design is too busy, when a room becomes crowded with furniture, when an essay contains too many captivating, stray thoughts, the beauty or central theme gets lost. Paring away the inessential is a necessary part of any invention; sometimes we simply need to cut, cut, cut in order to get to the truth. Scientists, who routinely challenge and test theories, are no strangers to that process. As a North Carolina State University paper notes, reproducibility is one of many tools necessary for rooting out error from the discovery process. Or, as Dr. Daniel Gilbert, lead author of the Harvard commentary, and his colleagues put it, "replication of empirical research is a critical component of the scientific process."
For this reason, then, the OSC's replication experiment is important not just to scientists but to all of us. To begin, they selected 100 studies and reached out to the scientists who had conducted them.
Replication Crisis?
Before conducting their experiments, the OSC asked the original psychologists to examine the planned reproduction and endorse it as faithful to their work. Nearly 70 percent of the original psychologists endorsed the OSC replications. After conducting their own experiments, the OSC group found that, depending on the criteria used, only 36 percent to 47 percent of the chosen studies were successfully replicated.
For their commentary on the replication study, Gilbert, a psychologist at Harvard, and his colleagues scrutinized the methods used by the OSC and reanalyzed the raw data. Immediately, they noticed a problem with how the 100 original studies had been selected. Namely, the OSC did not randomly sample from the total population of psychology studies, nor did they apply a statistical correction to account for that. As Gilbert and his co-authors explain, a researcher must do one or the other for validity’s sake.
Gilbert and his colleagues also calculated that the low-fidelity replications, those not endorsed by the original authors, were four times more likely to fail than the replications endorsed as faithful to the original; this suggests some of the OSC's reproduced experimental designs may have been biased toward failure. Finally, they say the OSC used a low-powered design, one destined to underestimate the replicability of psychological science from the start.
“As a result, OSC seriously underestimated the reproducibility of psychological science,” concluded Gilbert and his co-authors.
Source: Gilbert DT, King G, Pettigrew S, Wilson TD. Comment on “Estimating the reproducibility of psychological science.” Science. 2016.