Studies of scientific bias targeting the right problems
The biggest single source of bias across all fields of science comes from so-called small-study effects, Stanford researchers report.
In all fields of science, small studies, early studies and highly cited studies consistently overestimate effect size, according to a study led by researchers at the Stanford University School of Medicine.
A scientist’s early career status, isolation from other researchers and involvement in misconduct also appear to be risk factors for unreliable results, the research team reported.
A paper describing the work was published online March 20 in the Proceedings of the National Academy of Sciences. The lead author is Stanford senior research scientist Daniele Fanelli, PhD, and the senior author is John Ioannidis, MD, DSc, professor of medicine and of health research and policy.
Virtually all scientific work may be afflicted by some kind of bias, Ioannidis said. But how common different kinds of bias are, what factors cause bias, which kinds of bias are most common in different disciplines and how bias can be reduced are all questions being examined by researchers who study how science is done.
“I think that this is a mapping exercise,” said Ioannidis. “It maps all the main biases that have been proposed across all 22 scientific disciplines. Now we have a map for each scientific discipline, which biases are important and which have a bigger impact, and therefore scientists can think about where do they want to go next with their field.”
To show which sources of scientific bias were most common, the researchers reviewed more than 3,000 meta-analyses that included nearly 50,000 individual research studies across 22 scientific fields.
“Our study tested with much greater accuracy than before several hypotheses about the prevalence and causes of bias,” said Fanelli. “Our results send a reassuring message, but only in part.”
Types of bias
The researchers examined seven hypothesized kinds of scientific bias:
- Small-study effect: when studies with small sample sizes report large effect sizes.
- Gray literature bias: the tendency of smaller or statistically insignificant effects to be reported in PhD theses, conference proceedings or personal communications rather than in peer-reviewed literature.
- Early-extremes effect: when extreme or controversial findings are published early just because they are astonishing.
- Decline effect: when reports of extreme effects are followed by subsequent reports of reduced effects.
- Citation bias: the larger the effect size, the more likely the study will be cited.
- United States effect: when U.S. researchers overestimate effect sizes.
- Industry bias: when industry sponsorship and affiliation affect the direction and size of reported effects.
The Stanford team also looked at factors that have been hypothesized to increase the risk of bias, such as size and types of collaborations, the gender of the researchers or pressure to publish.
By far, the greatest bias came from small studies, while other sources of bias had relatively small effects. The team also found that small and highly cited studies and those in peer-reviewed journals seemed more likely to overestimate effects; U.S. studies and early studies seemed to report more-extreme effects; early-career researchers and researchers working in small or long-distance collaborations were more likely to overestimate effect sizes; and, not surprisingly, researchers with a history of misconduct tended to overestimate effect sizes.
On the other hand, studies by highly cited authors who published frequently were not more affected by bias than average. Research by men was no more likely to show bias than that of women. And studies from countries with very strong incentives to publish, such as the United States, didn’t seem to show more bias than studies from countries where the pressure was less. These results confirm, with much greater accuracy, previous studies on retractions and corrections and studies using more indirect proxies of bias.
“A country that has incentives to publish more may also have other features that make its science better,” Ioannidis said.
Each kind of bias may result from a variety of mechanisms. For example, in small studies, it’s easier to get “statistical significance” if the effect size is large. Since studies with statistically significant results are more likely to be published, it follows that small studies with large effects are also more likely to be published. Even independent of statistical significance, larger effects are more likely to be published than smaller effects.
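The small-study mechanism described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the method used in the PNAS paper: the true effect size (0.2), the sample sizes (20 and 500 per group), the approximate standard error formula and the rule that only statistically significant results in the expected direction get published are all hypothetical choices made for illustration.

```python
import math
import random
import statistics

TRUE_EFFECT = 0.2  # assumed true standardized effect (hypothetical value)
Z_CRIT = 1.96      # conventional cutoff corresponding to p < 0.05


def standard_error(n_per_group):
    # Rough large-sample standard error of a standardized mean
    # difference between two equal-sized groups (an approximation).
    return math.sqrt(2.0 / n_per_group)


def mean_published_effect(n_per_group, n_studies, rng):
    # Simulate many studies of the same true effect and keep only those
    # that reach significance in the expected direction (assumed filter).
    se = standard_error(n_per_group)
    published = []
    for _ in range(n_studies):
        observed = rng.gauss(TRUE_EFFECT, se)  # noisy estimate of the effect
        if observed / se > Z_CRIT:
            published.append(observed)
    return statistics.mean(published)


rng = random.Random(0)  # fixed seed so the run is repeatable
small = mean_published_effect(20, 5000, rng)
large = mean_published_effect(500, 5000, rng)

print(f"true effect:                    {TRUE_EFFECT:.2f}")
print(f"mean published, 20 per group:   {small:.2f}")
print(f"mean published, 500 per group:  {large:.2f}")
```

In this sketch, a study with 20 participants per group can reach significance only if its observed effect is above roughly 0.62, so every published small study overstates the assumed true effect of 0.2. The large studies clear the threshold with estimates much closer to the true value, which is the pattern the small-study effect describes.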
Ioannidis said that in the data they examined, the influence of different kinds of bias changed over time and seemed to depend on the individual scientist. “We show that some of the patterns and risk factors seem to be getting worse in intensity over time,” he said. “This is particularly driven by the social sciences, so if you broke scientific fields into big bins of biology, medicine, physical sciences and social sciences, it seems that the social sciences are seeing the more prominent worsening of these biases over time.”
One of the most unexpected findings of the study, Fanelli said, was that the kinds and amounts of bias were very irregularly distributed across the literature. “Although bias may be worryingly high in specific research areas, it is nonexistent in many others,” he said. “So bias does not undermine the scientific enterprise as a whole.”
Another finding of the study is that the relative magnitude of biases closely reflects the level of attention that they receive in the literature. That is, the kinds of biases researchers are most concerned about are in fact the ones they should be concerned about. “Our understanding of bias is improving, and our priorities are set on the right targets,” said Fanelli.
But that’s no cause for complacency, he said. “We perhaps understand bias better, but we are far from having rid science of it. Indeed, our results suggest that the challenge might be greater than many think because interventions might need to be tailored to the needs and problems of individual disciplines or fields. One-size-fits-all solutions are unlikely to work.”
Solutions and interventions
Ioannidis likewise cautioned that the data are purely observational, not experimental, and the question of how to reduce bias is far from clear. For example, he said, just because small studies tend to give exaggerated results doesn’t mean we should stop doing them. “One might say immediately, well, we need to do large studies,” he said. “That would be an intervention. But you can’t necessarily translate an association directly into an effective intervention.
“I think that one can take each one of these biases and say, ‘Well, let’s try to reduce it or eliminate it,’” he added. “Some of them are easier to reduce or eliminate, but then we have to see what that does to the wider scientific literature.”
Even so, Ioannidis said he believes that many fields would benefit from having both larger studies and the involvement of numerous authors who are close enough to monitor one another’s work. “But that’s something that is an extrapolation,” he said.
Making science better is quite possible, he said. “Physics, for example, at some point decided that they’ve had enough of these tiny studies done by small teams, and they went for a multi-team collaborative model,” he said. “That changed the entire paradigm of how they do research. It probably largely removed the problem of small-study bias.”
When physicists work together on huge projects, said Ioannidis, “you don’t have the problem of these thousands and tens of thousands of physicists, each one of them running their tiny experiment, and then just waiting to collate the results after the fact and trying to make sense of them.”
Since each field of science has a slightly different profile of biases, it makes sense, he said, for each one to choose the best ways to reduce the biases afflicting their field.
“This has to be a grass-roots movement,” Ioannidis said. “It has to be something that scientists believe is good for their science to do. Top-down approaches, such as institutions and funding agencies trying to promote best practices, could also help, but it has to be an agreement, and an agreement among all these stakeholders. And obviously, scientists need to believe that this is something that will help the results and their science to be more reliable.”
A researcher at Leiden University also co-authored the paper. This research was entirely supported by the U.S. Office of Research Integrity.