In an earlier post on the priming effect, I mentioned a study by Doyen et al. that attempted to replicate earlier work by Bargh et al. that had found "participants for whom an elderly stereotype was primed walked more slowly down the hallway when leaving the experiment than did control participants, consistent with the content of that stereotype." In their study, Doyen et al. led half the experimenters "to think that participants would walk slower when primed congruently and the other half was led to expect the opposite." The "walking speed effect" appeared only in subjects whose instructions came from experimenters who understood that the test was meant to see whether words associated with aging would prime subjects to mimic a stereotypical behavior of the aged.
Since most of us don't do experiments, my concern in this blog entry isn't to help experimenters design less biased experiments, but to help those of us who read accounts of those experiments either in scientific journals or in media accounts. What should we look for to determine whether experimenter bias has significantly affected the outcome of a study? And when we evaluate a journalist's account of a scientific study, does it give any hints of experimenter bias?
Recently, I attended a conference sponsored by two skeptics' groups. The first speaker at the conference talked about the neurology of religious experiences. She brought up the work of Michael Persinger, a cognitive neuroscience researcher at Laurentian University in Sudbury, Ontario, Canada. She claimed (and so have many others) that Persinger has induced strange feelings--such as the "feeling of a presence" and other feelings sometimes described as "mystical" or "spiritual"--by sending low level magnetic pulses to the temporal lobes. He has his subjects put on a device that has been dubbed "the god helmet" while they sit alone in a darkened, silent room for 30-60 minutes. Persinger has been conducting these experiments over a period of at least fifteen years. He has tons of data and many published papers in peer-reviewed journals. However, I knew that Richard Dawkins had put on the god helmet and sat in the makeshift sensory deprivation chamber without feeling the presence of anything unusual except for the helmet on his head. Dawkins and others have speculated that Persinger's subjects are having experiences that are induced not by magnetic pulses to the temporal lobes but by the power of suggestion and expectation, and a desire for a weird experience. They know what the experiment is about; they long to experience something "spiritual," or Persinger suggests what they will experience and then they do, thanks to his suggestion.
When I suggested that we should be skeptical of Persinger's work because it may be suggestion, not magnetic pulses to the temporal lobes, that is inducing strange feelings, Dr. Sarah Strand proceeded to lay out the evidence in support of Persinger and against the alternative explanation. Unfortunately, the only evidence she supplied came from Persinger himself and his own analysis of his data. Persinger would not be the most disinterested, unbiased party in such an evaluation. Rather than describe experiments where subjects had no idea what to expect, where some were clearly expecting to experience something weird but Persinger gave them no magnetic pulses at all, or a host of other kinds of experiments that would have ruled out experimenter bias and clearly shown that it was the magnetic pulses that were causing the weird feelings, we were told that Persinger had done some sort of reanalysis of the data he's collected over the years.
Furthermore, what would have clearly ruled out experimenter bias would have been reference to other studies done in other labs by other researchers who had done double-blind, randomized controlled studies that clearly ruled out suggestion or some aspect of the quasi-sensory deprivation chamber experience as a cause and isolated the magnetic pulses as the most significant factor in the inducement of such things as the "feeling of a presence." Unfortunately, Dr. Strand didn't cite any other studies. Why? Because they don't exist.
The only other scientist who has tried to replicate Persinger's work is Pehr Granqvist of Uppsala University, along with his research team. In a double-blind, controlled study with 90 participants, they found that magnetic pulses had no discernible effect. They did find, however, that many subjects from both groups claimed to have had strong religious experiences during the sessions. "Two out of the three participants in the Swedish study that reported strong spiritual experiences during the study belonged to the control group, as did 11 out of the 22 who reported subtle experiences." Persinger argued that the replication failed because the magnetic pulses had not been strong enough or given over a long enough period, which seems absurd given that so many subjects in both the control and experimental groups reported strong or subtle effects.
If it turns out that Granqvist's study is what other labs with no special interest in the outcome continue to find, then Michael Persinger has been deluding himself and others for fifteen years. He would not be the first Ph.D. to have done so. Nor would he be the first to have tainted his experiments with unintentional bias. (If the reader is wondering why Persinger would think stimulating the temporal lobes would induce a "spiritual" experience, it is probably because there have been many reports of those with temporal lobe epilepsy experiencing such things as "oneness with everything."* For more on "spiritual" experiences associated with temporal lobe epilepsy, see V.S. Ramachandran, Phantoms in the Brain, 1998.)
We should also note that it is common for some scientists and journalists to falsely and unjustly accuse other scientists of experimenter bias when the scientists' experiments contradict the accuser's beliefs. It does seem to be a fact that parapsychologists who are skeptical usually get negative results in their psi studies, while believers in psi often get positive results. One important exception is Susan Blackmore, who, while a true believer, continually got negative results and left parapsychology because of it. She's turned her attention to other matters, including trying to figure out why people believe in psi when the evidence for it is so flimsy. In any case, a skeptic (Richard Wiseman) and a true believer (Marilyn Schlitz) explored the experimenter effect while doing a joint study on the staring effect. In "Experimenter effects and the remote detection of staring" the authors describe the earlier work that led them to collaborate:
"Both authors of the present paper previously attempted to replicate this staring effect. The first author (R. W.) is a skeptic regarding the claims of parapsychology who wished to discover whether he could replicate the effect in his own laboratory. The second author (M. S.) is a psi proponent who has previously carried out many parapsychological studies, frequently obtaining positive findings. The staring experiments carried out by R. W. showed no evidence of psychic functioning (Wiseman & Smith, 1994; Wiseman, Smith, Freedman, Wasserman, & Hurst, 1995). M. S.'s study, on the other hand, yielded significant results (Schlitz & LaBerge, 1997)."
Even though the authors designed the experiments together, the skeptic got negative results and the psi proponent got positive results. They offer several possible explanations for the difference in their results. I encourage the reader to review their alternative explanations. One explanation they don't seem to consider is that the difference in results could have just been a fluke. More joint experiments by skeptics and psi proponents might resolve this issue, but there is so much hostility between parapsychologists and skeptics that cooperation like that of Schlitz and Wiseman is rare. In any case, the charge of experimenter bias should be ignored, whether made by a skeptic or a psi proponent, unless it is backed up by specific evidence that bias has likely occurred. Claims that the beliefs of skeptics and psi proponents affect the telepathic or precognitive abilities of subjects are pure speculation and beg the question.
In addition to the experimenter biasing a study by unconsciously cuing or signaling a person or animal subject, others nearby may also be unintentionally providing information to the subject. One of the earliest scientific studies to recognize this was done by psychologist Oskar Pfungst while testing a horse that allegedly could understand German and do such amazing things as count or figure out what day it was. Pfungst recognized that the horse was responding to subtle physical cues (ideomotor reaction) that others had mistaken for understanding language and ability to do math. "Hans [the horse] was responding to a simple, involuntary postural adjustment by the questioner, which was his cue to start tapping, and an unconscious, almost imperceptible head movement, which was his cue to stop" (Hyman 1989: 425). Giving such inadvertent cues to a subject is now known as the Clever Hans effect.
I may have seen the Clever Hans effect in action while watching a television program about dogs who can allegedly detect cancer cells by smell. Dogs were led into a room with several dishes spread out on the floor. The dog would sniff here and there and eventually stop by one bowl, the bowl that had the cancer cells in it. I noticed that the investigator was in the room with the dog at all times. This made me wonder if some subtle, unintentional cue wasn't being given to the dog by the investigator. My suspicion was aroused because the investigator went right to the bowl the dog had picked and announced it was the right one. A blinded experiment would have been set up so that the investigator in the room would not know which bowl had the cancer cells. This would avoid any unintentional cuing.
Double-blind experiments that use control groups help experimenters avoid, or greatly reduce the chances of, an experimenter effect skewing the outcome of their study. Of course, double-blind experiments can be set up in biased ways, e.g., by not randomizing the assignment of subjects to the experimental and control groups or by using a sample that is too small even if it is randomized. And there are other ways that experimenters can bias a study without knowing it. For example, consider a psientist (a scientist who studies psychic phenomena) who tests subjects for telepathy by having them guess which of four pictures a sender attempted to transmit. It might not occur to him that he must make sure the order in which the pictures are presented to the receiver is random. Not only should the experimenter who shows the pictures to the receiver not know which is the correct one, but the experimenter responsible for putting the pictures in an envelope should make sure that the correct picture occurs in the first, second, third, and fourth positions at about the same rate. The experimenter who sets out the pictures for the receiver to review must present them in exactly the same way for each subject, e.g., the first picture goes to the upper left, the second to the upper right, the third to the lower left, etc. This would avoid any unintentional bias that might occur if subjects are more likely to select, say, the first picture or the upper right picture at a greater than chance rate.
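For readers who like to see such a procedure spelled out, here is a rough sketch in Python of how the position of the correct picture might be scheduled so that it lands in each of the four slots about equally often. This is only an illustration under my own assumptions, not the procedure used in any actual psi study: the function name balanced_target_positions, the seed value, and the choice of 40 trials are all made up for the example. The idea is that someone who never meets the receiver would generate the schedule and stuff the envelopes accordingly.

```python
import random
from collections import Counter


def balanced_target_positions(n_trials, n_positions=4, seed=None):
    """Return a shuffled list of target positions (1..n_positions).

    Hypothetical helper: each position appears as equally often as
    n_trials allows, so no slot is favored for the correct picture.
    """
    rng = random.Random(seed)
    full_blocks, remainder = divmod(n_trials, n_positions)
    # Every position appears once per full block.
    positions = list(range(1, n_positions + 1)) * full_blocks
    # Leftover trials get distinct, randomly chosen positions.
    positions += rng.sample(range(1, n_positions + 1), remainder)
    # Shuffle so the order of positions across trials is unpredictable.
    rng.shuffle(positions)
    return positions


# Example: 40 trials with four pictures per trial (numbers are illustrative).
schedule = balanced_target_positions(40, seed=2024)
counts = Counter(schedule)
print(counts)
```

With 40 trials and four positions, each position holds the correct picture exactly 10 times, so a subject who simply favors the first or the upper right picture would score at chance. The order of the schedule still has to be kept from the experimenter who sits with the receiver, or the blinding is lost.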
Psientists must be especially sensitive to experimenter effects since any transfer of information by ordinary sensory means from experimenters to subjects contaminates any claim to extrasensory transmission of information. The problem in parapsychology is serious enough to have a name: sensory leakage, which Daryl Bem described as the most potentially fatal flaw in psi studies.
Experimenters can bias their studies in so many ways that it is probably not possible to conduct a perfect scientific study. There will always be something that could have been done better. In addition, there is the possibility of fraud or media distortion and manipulation by the experimenter. Also, some experimenters are simply incompetent.