Let's start with an example of good causal reasoning. The claim that smoking causes lung cancer is based on data that demonstrate to a high degree of probability that had a person with lung cancer not smoked, they would not have the kind of cancer they have today. We describe this kind of relationship by saying that smoking is a necessary condition for lung cancer. Expressing it this way can be misleading, though, so some explanation is required. Certainly, people who have never smoked can get lung cancer, so in that sense it is not necessary to smoke in order to get lung cancer. When we say that smoking is a necessary condition for lung cancer, we mean that for the particular cancer a person has, smoking was necessary. In simpler terms, had the person not smoked, they would not have gotten the kind of lung cancer that is caused by smoking.
How would one go about showing that smoking causes cancer, i.e., is a necessary condition for cancer? One begins by making predictions and testing them. For example, if smoking causes cancer then you would expect that if you randomly selected 100,000 people, divided them into smokers and non-smokers, and observed them over a 20-year period, you would find a significantly greater number of lung cancer cases in the smoking group. Significance is measured by a statistical formula that basically says that it is highly unlikely that the difference between the two groups is due to chance. Many more predictions should be tested before jumping to the conclusion that smoking causes lung cancer. These predictions might take into account how long people have smoked, how many cigarettes a day they smoked, how long it has been since they quit (if they did), and so on.
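To make the idea of significance concrete, here is a minimal sketch of one common way to compare two groups, a two-proportion z-test. The function name and the case counts are invented for illustration; they are not data from any actual smoking study, and a real analysis would also have to deal with the other factors mentioned above.

```python
import math

def two_proportion_z_test(cases_a, n_a, cases_b, n_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_a = cases_a / n_a
    p_b = cases_b / n_b
    # Pooled rate under the assumption that both groups share one true rate
    pooled = (cases_a + cases_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal variable
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, made up for this example only:
# 30,000 smokers with 600 cases, 70,000 non-smokers with 140 cases.
z, p = two_proportion_z_test(cases_a=600, n_a=30_000, cases_b=140, n_b=70_000)
print(f"z = {z:.1f}, p = {p:.3g}")
```

A very small p-value here says only that chance is an unlikely explanation for the difference. It does not by itself rule out other explanations.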
Establishing causality requires more than just finding a statistic that is consistent with the hypothesis that x causes y, however. For example, you might find that all 30 men in a West Virginia lung cancer ward had worked in the coal mines. From that you might infer that coal dust causes lung cancer. But what if you also find out that all 30 were smokers? That makes the issue a bit more complicated. Perhaps coal dust contributes to lung cancer or perhaps smoking alone can account for these cancers. Further predictions would have to be tested to try to tease out the role, if any, of coal dust in lung cancer.
A correlation between x and y must exist if x and y are causally related. That is, it must be the case that x is generally followed by y or that y is generally preceded by x, that as x increases or decreases so does y, or that as x increases, y decreases. Finding a correlation between x and y, however, does not mean they are causally related. There are very good correlations between age and shoe size, hat size, height, and weight, but age doesn't cause any of these things.
One problem with claiming a causal relationship based solely on a strong correlation is that in addition to a causal connection there are at least three other plausible explanations for the correlation. One, it could be a fluke. Two, there might be a causal relationship, but the correlation can't tell you which is the cause and which the effect. For example, you might find a good correlation between the increase in sex education classes for high school students and an increase in teenage pregnancies. But you have no way of knowing just from the correlation whether this is a coincidence, whether the sex ed classes stimulated interest in sexual activity leading to increased pregnancies, or whether the increase in pregnancies prompted school officials to add more sex education classes.
Three, there might be a causal relation involved but not x causes y or y causes x. Rather, it might be the case that z causes both x and y, or that z and x cause y (or z and y cause x). For example, there might be a good correlation between taking birth control pills and developing blood clots in women of a certain age group, but when controlled for smoking the correlation between taking birth control pills and developing blood clots might go away. In a study on this issue, it was found that smoking plus taking birth control pills increased the chances of blood clots more than just smoking did, while those who did not smoke but took the pill had no greater frequency of blood clots than women in general (Ronald Giere, Understanding Scientific Reasoning, 297-303).
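What "controlling for" a third factor means can be sketched with a small stratified comparison. The counts below are hypothetical and were chosen to illustrate the simplest case, where z (smoking) accounts for the whole correlation; they are not the figures from the study Giere discusses, which also found an added risk for smokers who took the pill.

```python
# Hypothetical records: (smoker, on_pill, number_of_women, number_with_clots)
rows = [
    (True,  True,  800, 40),
    (True,  False, 200, 10),
    (False, True,  200,  2),
    (False, False, 800,  8),
]

def risk(records):
    """Fraction of women in these records who developed clots."""
    women = sum(r[2] for r in records)
    clots = sum(r[3] for r in records)
    return clots / women

def risk_ratio(records):
    """Clot risk among pill users divided by clot risk among non-users."""
    on_pill = [r for r in records if r[1]]
    off_pill = [r for r in records if not r[1]]
    return risk(on_pill) / risk(off_pill)

# Crude comparison: ignore smoking altogether
crude = risk_ratio(rows)

# Stratified comparison: compare pill users with non-users among smokers
# only, then among non-smokers only
by_smoking = {
    smoker: risk_ratio([r for r in rows if r[0] == smoker])
    for smoker in (True, False)
}

print(f"crude risk ratio: {crude:.2f}")
print(f"among smokers: {by_smoking[True]:.2f}")
print(f"among non-smokers: {by_smoking[False]:.2f}")
```

In these made-up numbers pill users are more likely to be smokers, so the pill looks risky in the crude comparison even though, within each smoking group, pill users and non-users have the same clot rate.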
On the other hand, if x and y are causally related there should be a good correlation between them, which allows us to make predictions that test the hypothesis x causes y. For example, if cell phone use causes brain tumors then we should find a significantly greater number of brain tumors among cell phone users compared to those who don't use cell phones. No data yet has supported this claim, which strongly indicates that cell phone use isn't a significant causal factor for brain tumors. However, had we found a strong correlation between cell phone use and brain tumors, that would confirm our hypothesis but it would not be enough evidence to establish a causal connection. Before jumping to the conclusion that cell phones cause brain tumors we would need to do two things: one, we must rule out other plausible causes for the correlation and two, we would need to make more testable predictions that ideally would be more rigorous than the original test.
Many people consider the randomized control group study to be the gold standard in science, especially in medicine. But since there are many variables that can affect the outcome of a controlled study, it is important that we not put too much faith in a single study, especially if the study is small and involves multiple outcomes. For example, the Sicher-Targ distant healing study involved only 40 subjects and looked at 23 possible outcomes for AIDS patients, some of whom were prayed for and some of whom weren't prayed for by a special group of praying people. Finding a significant correlation between a few of the 23 possible outcomes, even for a small experimental group of 20 subjects, is expected by the laws of chance and should not be taken to indicate any kind of causal relationship between the praying and the few outcomes that correlated significantly.
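The arithmetic behind "expected by the laws of chance" can be sketched as follows. Two assumptions are made here that the text above does not state: each outcome is tested at the conventional 0.05 threshold, and the 23 outcomes are treated as independent of one another. Real outcomes are usually correlated, so this is a rough guide, not a reanalysis of the study.

```python
def chance_findings(num_outcomes, alpha=0.05):
    """Expected false positives, and the chance of at least one, when no real effect exists."""
    # Each test has probability alpha of a false positive
    expected_false_positives = num_outcomes * alpha
    # Probability that not every test comes back negative (assumes independence)
    p_at_least_one = 1 - (1 - alpha) ** num_outcomes
    return expected_false_positives, p_at_least_one

expected, p_any = chance_findings(23)
print(f"expected false positives: {expected:.2f}")
print(f"chance of at least one: {p_any:.2f}")
```

Under these assumptions, a study with 23 outcomes and no real effect would still produce about one "significant" result on average, and the chance of getting at least one is roughly 0.69.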
Probably the most common causal error is to conclude that because test results confirm a hypothesis, the hypothesis is established to a significant degree of probability. As long as other plausible explanations can't be ruled out, finding that an experimental result is just what you predicted if your hypothesis is correct means only that your hypothesis can't be ruled out. A prime example of this kind of faulty reasoning permeates more than a century of experimentation by parapsychologists in their quest to prove ESP and psychokinesis. I have discussed this elsewhere under the heading of the psi assumption.
Briefly, the psi assumption is the assumption that any significant departure from the laws of chance in a test of psychic ability is evidence that something anomalous or paranormal has occurred. Departure from the laws of chance would be consistent with the psi hypothesis, but until all other plausible explanations have been ruled out, it is hasty to conclude that evidence for psi has been found. There are several plausible explanations for the data in psi experiments. Cheating by subjects is commonplace. Fraud by experimenters is rare, but it has happened (e.g., the Soal-Goldney experiment [1941-1943]). Methodological errors and sloppiness have occurred in experiments that have been hailed as slam-dunk proof by parapsychologists like Dean Radin. For example, Susan Blackmore was appalled when she visited the lab of Carl Sargent, whose work played a major role in the ganzfeld studies of Bem and Honorton.
....I went to visit Sargent's laboratory in Cambridge where some of the best ganzfeld results were then being obtained. Note that in Honorton's database nine of the twenty-eight experiments came from Sargent's lab. What I found there had a profound effect on my confidence in the whole field and in published claims of successful experiments.
These experiments, which looked so beautifully designed in print, were in fact open to fraud or error in several ways, and indeed I detected several errors and failures to follow the protocol while I was there. I concluded that the published papers gave an unfair impression of the experiments and that the results could not be relied upon as evidence for psi. (Blackmore 1987)
Other errors, such as sensory leakage and experimenter effects, questionable methodologies such as displacement and psi missing, and misapplication of statistics must all be considered before jumping to the conclusion that a statistic that is unlikely to be due to chance according to some arbitrary formula is proof of anything paranormal.
Another area where it is common to mistake correlation for causation is in medicine. Two examples stand out, and both involve MRIs. When the MRI became widely available in the late 1980s, doctors began using it on patients complaining of severe back pain. The MRIs showed many things, including spinal discs that appeared degenerated in people with severe back pain. Doctors concluded that the pain was being caused by the abnormalities in the discs. Prior to the use of the MRI, the most common treatment for back pain was no treatment at all. Most back pain goes away on its own. After the introduction of the MRI, various kinds of surgical procedures were done to alleviate the pain assumed to be caused by the bulging or herniated disc. Before jumping to the conclusion that the abnormal discs were causing the back pain, scientists should have done MRIs on people who don't have any back pain. This was finally done in 1994 on ninety-eight people. They went to the doctor, got an MRI, and two-thirds of them had abnormalities in their discs. Maybe the bulging discs had nothing to do with the pain (Jonah Lehrer, How We Decide, pp. 160-163). MRIs, it turns out, "find abnormalities in everybody" (Jerome Groopman, How Doctors Think).
The other example involves the overuse of MRIs for injured athletes. Baseball pitchers with injured arms, for example, are often advised to have surgery based on an MRI that finds abnormalities. Dr. James Andrews, a widely known sports medicine orthopedist in Gulf Breeze, Fla., wanted to test his suspicion that MRIs might be misleading. He scanned the shoulders of 31 perfectly healthy professional baseball pitchers. The pitchers were not injured and had no pain. But the MRIs found abnormal shoulder cartilage in 90 percent of them and abnormal rotator cuff tendons in 87 percent. This finding strongly indicates that abnormalities seen on the MRIs of injured pitchers are not necessarily the cause of their pain.
Epidemiologists are particularly prone to making hasty judgments about causal connections based on correlations. An epidemiologist describes the problem:
...there is an ongoing national study of a chemical in plastic that is in EVERYTHING and EVERYONE. It is ubiquitous and it is a chemical, so something bad may be happening to us. So epidemiologists are asking what diseases might be associated with this chemical. And true to form, they are looking at every disease possible…exploring hundreds of diseases known to man to see if there is a statistical link to the chemical. Even before they start, we know they will find one or many links. If one disease is found that appears to be linked and the calculated statistic of p < or = to 0.05 is present, then the press release goes out that the chemical causes some horrible cancer or some toes to fall off. Everyone panics and goes to REI to buy chemical free plastic water bottles. Of course, the call will be for further studies. Those studies show the findings can’t be replicated. But those follow-up negative studies don’t get published and only make page 15 in the newspaper rather than page one. AND the damage is already done as the plastic industry tanks, the toe falling off attorneys are lining up and everyone starts limping for effect. ("Problems in Epidemiology land" by Dr. Ward Robinson)
Finally, there are some parapsychologists who have found correlations between what they consider to be important events that catch the attention of many people (such as the death of a princess) and blips on random number generators (RNGs). These folks call their work the Global Consciousness Project, and they are led by people like Dean Radin and Roger Nelson. Radin and Nelson think that when groups of people focus their minds on the same thing, they influence “the world at large” and that this influence is shown by blips on RNG machines. In addition to basing a causal connection on nothing but a correlation, these fellows are guilty of selection bias, a problem that plagues epidemiologists as well.