*statistical* outcome. (You might consider a study that finds a strong correlation between not smoking cigarettes and not getting lung cancer to have a *negative* outcome, but a study that finds a *significant statistical correlation*--that is, a *correlation not likely due to chance*--between not smoking and not getting lung cancer is a positive-outcome study.) Studies that find nothing of statistical significance or of possible causal consequence often don't get published. Because of various types of publication bias, the scientific community and the general public are often presented with a skewed and biased view of findings by scientific researchers.

Researchers who find nothing of statistical significance in small studies usually do not present their findings at scientific meetings, nor do they submit their work to journals. Such behavior is known as the *file-drawer effect*, since these studies get filed rather than submitted for publication. Large studies--whether observational or control group studies--will usually get published unless they have some obvious methodological flaw. Small studies are more susceptible to statistical flukes than large studies, other things being equal. The most common statistical formula used in the social sciences and medical studies considers a statistic significant if a result that extreme would turn up by chance alone no more than 5% of the time. Statistical significance does not mean that a statistic is *important*; it means that according to some formula the statistic is not likely due to chance, i.e., not likely a fluke. The smaller the sample in a study, the greater the chance of finding statistical significance when a larger study would find nothing of statistical significance. Also, the smaller the study, the greater the chance of missing a correlation that a larger study would find at some level of statistical significance. The latter situation occurs more often when the correlation is small. Again, statistical significance does not mean *important*. It may be true that your study had a sample of 18,000 and found a statistically significant difference in heart attacks between subjects taking a dummy pill and subjects taking rosuvastatin (to lower cholesterol), but that doesn't mean the difference is important. (The difference was 0.2 events per 100 person years. What the side effects of taking the statin over a period of years might be is unknown, but they might outweigh the small benefit of taking it.)
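The gap between significant and important can be made concrete with a little arithmetic. The sketch below is only an illustration, not a reanalysis of the rosuvastatin study: the event rates (2.0% versus 1.5%) and the sample sizes are made-up numbers, and `two_proportion_p_value` is a hypothetical helper that applies a standard two-sided z-test for the difference between two proportions.

```python
import math

def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # erfc gives the two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical numbers: the same half-point difference (2.0% vs 1.5%)
# observed in a small study and in a very large one.
p_small = two_proportion_p_value(20, 1_000, 15, 1_000)
p_large = two_proportion_p_value(2_000, 100_000, 1_500, 100_000)

print(f"1,000 per group:   p = {p_small:.3f}")
print(f"100,000 per group: p = {p_large:.2e}")
```

Under these assumed numbers the difference falls short of the 5% threshold in the small study and clears it easily in the large one, yet in both cases it is the same half a percentage point. Whether half a point matters is a question the p-value cannot answer.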

Publishing large studies that have positive but unimportant outcomes is one way that scientific journals can bias the information the mass media filters for our consumption. Another way is to publish small studies with positive outcomes and expect journalists and the general public to recognize that nothing much should be made out of a single small study.
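Why so little should be made of a single small positive study can be seen in a simulation. The following is a rough sketch under stated assumptions, not a model of any real literature: every simulated study compares two groups of 20 drawn from the same distribution, so the true effect is zero, and the hypothetical `run_small_study` helper counts a study as "positive" when a two-sided z-test passes the 5% threshold. Only the positive studies are treated as published.

```python
import math
import random

def run_small_study(rng, n_per_group):
    """Simulate one study in which the treatment truly does nothing.

    Both groups are drawn from the same distribution (mean 0, sd 1),
    so any observed difference is a fluke.
    """
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    diff = sum(treated) / n_per_group - sum(control) / n_per_group
    z = diff / math.sqrt(2.0 / n_per_group)  # assumes a known sd of 1
    return diff, abs(z) > 1.96  # two-sided test at the 5% level

rng = random.Random(2011)  # fixed seed so the run is repeatable
n_per_group = 20
results = [run_small_study(rng, n_per_group) for _ in range(2000)]

all_effects = [abs(d) for d, _ in results]
published = [abs(d) for d, significant in results if significant]

share_significant = len(published) / len(results)
mean_all = sum(all_effects) / len(all_effects)
mean_published = sum(published) / len(published)

print(f"share of studies that are 'positive': {share_significant:.3f}")
print(f"mean |difference|, all studies:       {mean_all:.2f}")
print(f"mean |difference|, published only:    {mean_published:.2f}")
```

Roughly one study in twenty comes out positive even though there is nothing to find, and the positive ones necessarily report the largest differences. A reader who sees only those would conclude that the effect is both real and sizable.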

But even when scientists do submit work with negative findings--such as not finding any evidence for precognition--their work is often rejected simply because it is not positive. A recent example of this bias occurred with the *Journal of Personality and Social Psychology*, a journal of the American Psychological Association (APA). In 2011, this journal published work by parapsychologist Daryl Bem [Vol 100(3), Mar 2011, 407-425] that purported to find positive evidence in support of precognition ("Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect"). When scientists Stuart J. Ritchie, Richard Wiseman, and Christopher C. French submitted a paper that replicated the best of Bem's work but which resulted in no evidence for precognition, the journal refused to consider the research paper for publication. (A study is considered a 'replication' of another study if it replicates the *methods* of that study, regardless of the results found in the new study.) They were told that the journal does not publish replications. We will never know what the journal would have done had the negative study been submitted first, but my guess is that it would not have been published because its results harmonize with what most psychologists take for granted: there is no precognition. (The failed replication was published online at PLoS ONE and is called "Failing the Future: Three Unsuccessful Attempts to Replicate Bem's ‘Retroactive Facilitation of Recall’ Effect.")

Mass media articles about scientific work often mislead the public because they do not report on negative-outcome studies. (Again, I remind the reader that a negative-outcome study is one that finds nothing of statistical significance.) Worse, many of the studies covered by the mass media are too small to generalize from. The most outrageous recent example of the mass media turning a small study into a major catastrophe is the Andrew Wakefield report on 12 children. Wakefield claimed he found a connection between the MMR vaccine and developmental disorders. Members of the anti-vaccination movement used this report to incite a panic regarding vaccines and autism. Because of the concern raised over vaccines and developmental disorders, several large studies were conducted, and they all failed to find evidence of a correlation between vaccines and developmental disorders. These studies were published and widely publicized by the mass media. Nevertheless, the damage had been done, and the subsequent reports in both scientific journals and the mass media have done little to quell the panic. Also, rather than change their minds about vaccines, the anti-vaccinationists have found many reasons to reject the studies that show their position is wrongheaded, thereby exemplifying the backfire effect.

One of the ways in which positive-outcome bias skews our understanding of the results of scientific research is in how it affects *systematic reviews*, such as those done by the Cochrane Collaboration. This large group of academics from around the world tries to examine all the scientific studies that have been done on a particular medical treatment, conventional or unconventional. Different studies are given different values depending on how they were designed, how large they were, etc. The group tries to determine in an unbiased way what the best evidence is for any particular treatment. But their work can be very misleading because often negative studies don't get published, which they admit:

> Systematic reviews aim to find and assess for inclusion all high quality studies addressing the question of the review. But finding all studies is not always possible and we have no way of knowing what we have missed. Does it matter if we miss some of the studies? It will certainly matter if the studies we have failed to find differ systematically from the ones we have found. Not only will we have less information available than if we had all the studies, but we might come up with the wrong answer if the studies we have are unrepresentative of all those that have been done.
>
> We have good reason to be concerned about this, as many researchers have shown that those studies with significant, positive, results are easier to find than those with non-significant or 'negative' results. The subsequent over-representation of positive studies in systematic reviews may mean that our reviews are biased toward a positive result.

The value of scientific studies is often measured by how many scientists make reference to the study. *Citation bias*, an inevitable consequence of positive-outcome bias, magnifies the skewing problem. The Cochrane Collaboration uses the funnel plot to estimate the significance of positive-outcome bias in a systematic review. "It assumes that the largest studies will be near the average, and small studies will be spread on both sides of the average. Variation from this assumption can indicate publication bias." Of course, if large studies with negative results are stuck in the file drawer, the funnel plot will be misleading. This is not likely to happen as long as the studies are methodologically sound. But some of these large studies may mislead us into thinking that some small difference, though statistically significant, is important when it isn't.
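The funnel-plot assumption can be illustrated with a crude simulation. This is a sketch under invented assumptions, not the Cochrane Collaboration's procedure: the true effect is set to zero, the study sizes (20 and 400 per group) are arbitrary, and `reaches_print` is a hypothetical publication rule in which large studies always appear while small ones appear only when significantly positive. Instead of drawing the plot, the code simply counts how many small studies land above the large-study average.

```python
import math
import random

def observed_effect(rng, n_per_group):
    """Observed difference in means when the true effect is zero."""
    treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    return sum(treated) / n_per_group - sum(control) / n_per_group

def reaches_print(effect, n_per_group, large_cutoff=200):
    """Hypothetical rule: large studies always appear in print,
    small ones only when significantly positive."""
    if n_per_group >= large_cutoff:
        return True
    return effect / math.sqrt(2.0 / n_per_group) > 1.96

def share_above(effects, centre):
    """Fraction of effects that fall above the given centre."""
    return sum(e > centre for e in effects) / len(effects)

rng = random.Random(7)  # fixed seed so the run is repeatable
sizes = [20] * 1500 + [400] * 30
studies = [(n, observed_effect(rng, n)) for n in sizes]
published = [(n, e) for n, e in studies if reaches_print(e, n)]

large = [e for n, e in published if n >= 200]
centre = sum(large) / len(large)  # the large studies set the "average"

small_all = [e for n, e in studies if n < 200]
small_published = [e for n, e in published if n < 200]

print(f"small studies above the average, all run:   {share_above(small_all, centre):.2f}")
print(f"small studies above the average, published: {share_above(small_published, centre):.2f}")
```

When every study is counted, the small ones fall about evenly on both sides of the large-study average, as the funnel plot assumes. In the published subset they pile up on one side, which is the kind of lopsidedness the plot is meant to reveal.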

A study that tried to measure positive-outcome bias in peer review at a medical journal found small but significant differences in how reviewers evaluated positive and negative studies.
