Why can’t we reproduce so many scientific findings?

Why would someone do the same experiment more than once, if they already found the answer? Isn’t that the very definition of insanity? Being able to do an experiment over again and confirm the results is actually a crucial aspect of science, called reproducibility. If scientists find a certain result from an experiment, but no one else is able to replicate it, that’s a red flag. It indicates that the first scientists could have made a mistake or that the original study design wasn’t sound.

“Reproducibility is truth. Without reproducibility, we cannot make significant progress in human health.”

Dr. Bibiana Bielekova, Issues in Science and Medicine

Unfortunately, a shockingly large proportion of the science done in the US may not be reproducible. In a landmark 2012 study, cancer researchers Glenn Begley and Lee Ellis tried to replicate 53 cancer biology studies and were able to get the same results for only six of them. A 2015 effort to reproduce 100 psychology experiments found a similar pattern: fewer than half of the results could be replicated.

Barriers to reproducibility

What are some of the barriers that scientists face when trying to reproduce experiments, and how can we overcome them? Researchers at the Center for Open Science undertook an incredibly ambitious effort to find out as part of their Reproducibility Project. They spent eight years trying to replicate 193 experiments from 53 cancer biology papers published between 2010 and 2012. Even after all that time, they were able to complete replications of only 158 effects, drawn from 23 of those papers.

Here’s why it was so hard. As the researchers explain in an accompanying paper, only four experiments (out of 193!) had publicly available data for computing effect sizes and other calculations. Most experiments required the scientists to ask the original authors to share a key reagent (a substance or mixture used in a chemical reaction). And none of the experiments were described in enough detail in the original papers for other scientists to repeat them. This meant the researchers had to go back to the original authors to ask for help, but for one-third of the experiments the authors were unhelpful or never responded.

When they could replicate the experiments, the researchers found that the results were less impressive than the original findings; on average, effect sizes in the replications were 85% smaller than in the original studies. Most of the studies that originally reported negative findings were successfully replicated, while fewer than half of the studies that originally reported positive results could be replicated.

Just because a result isn’t reproducible doesn’t mean it’s necessarily wrong. Differences in results can be due to changes in protocols (how the study was conducted), the statistical tests applied to the results, and the skill of the scientists doing the work. However, the study authors noted cases in which even the original labs could not figure out how to redo their own experiments, either because the scientists involved had left or because the protocols hadn’t been recorded. That’s not an encouraging sign for reproducibility.

Moving forward

How do we address the reproducibility crisis? Researchers point to the need for more incentives to encourage scientists to reproduce others’ findings. In the current science landscape, researchers are rewarded for publishing new research, not validating the research of others. If you were a funder, would you rather spend money on an experiment to uncover a new potential cancer cure, or to redo a previous cancer biology experiment? The second one just doesn’t sound as exciting.

“[Replication] won’t make a career,” said Tim Errington, a cancer biologist at the Center for Open Science and lead author of the recent study, in STAT News. “It’s not the flashy science that people want, not a positive result, because they’re redoing something. So we need to figure out how to balance that as a culture.”

There are a few bright spots in the research world when it comes to changing this culture. A new consortium of funders called “Ensuring Value in Research” promotes research with pre-validated methods and open-access publication, which could help encourage more sound and replicable science. There are other small fixes that would make it easier for researchers to replicate experiments, like journals eliminating word count limits in methods sections or encouraging researchers to submit supplemental material with their full study protocol.

The Center for Open Science has developed another approach called Registered Reports, in which scientists submit their study methods to peer review before they conduct the experiment. This helps reduce the bias that can occur when researchers feel pressure to adjust their methods or statistical tests to get a certain result. And it saves time because researchers don’t have to redo experiments if their methods weren’t up to par the first time.

And technology may offer a promising way to reward sound science and identify the most fruitful paths for research. Currently, researchers whose studies are the most cited or most popular in “high-impact” journals get rewarded, even when their methods may not be reproducible. In a recent piece in Issues in Science and Medicine, former Lown Institute VP Shannon Brownlee and National Institutes of Health researcher Bibiana Bielekova propose using machine learning algorithms to evaluate articles on their rigor, reproducibility, and societal impact, rather than on their popularity or citation counts. This could incentivize better research methods while illuminating the research paths most likely to bear fruit.

Reproducibility is a concerning issue in science, but these new initiatives also give us something to be optimistic about. Now go out there, and do the same thing over again!