## Statistical Evidence of Causality

We have seen that a statistic gives us numerical information about a class, such as the total number of its members or their average values on a variable.

Statistics can also tell us about correlations among these numerical properties.

A correlation can take many forms, depending on the type of statistic involved.

Examples:

Average income correlates with the amount of education people have.

The frequency of lung cancer is higher among smokers than among nonsmokers.

Total government revenues from the capital gains tax have increased as the tax rate has gone down.

What these examples have in common, what makes them examples of correlation, is a systematic, nonrandom relationship between two variables: income and education, smoking and lung cancer, revenues and tax rates.
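One common way to put a number on such a systematic relationship is the Pearson correlation coefficient, which runs from -1 (one variable falls as the other rises) through 0 (no linear relationship) to +1 (both rise together). The sketch below is only an illustration: the function name `pearson_r` and the education and income figures are invented for the example and do not come from any real survey.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: years of education and income (in thousands) for 8 people.
years_of_education = [10, 12, 12, 14, 16, 16, 18, 20]
income_thousands = [28, 35, 31, 42, 55, 48, 60, 75]

r = pearson_r(years_of_education, income_thousands)
print(f"correlation between education and income: {r:.2f}")
```

A value near +1 corresponds to the education and income example; a negative value would correspond to a case like the capital gains example, where one variable rises as the other falls.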

Correlations are important because they can give us evidence of causality. In a complex system, a given effect is often the result of a great many factors--none of which by itself is either necessary or sufficient. However, a given factor can be a partial or contributing factor: something that increases the likelihood of the effect, something that weighs in the balance and can tip it if the right combination of other factors is also present.

In general, a contributing factor usually can't be identified by looking at individual cases, but it reveals itself in the existence of a correlation among variables in the relevant class.
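To make the idea of a contributing factor concrete, here is a minimal sketch that compares how often an effect occurs with and without a factor. All counts are hypothetical, chosen only so that the arithmetic is easy to follow; the variable names are likewise invented for the illustration.

```python
def rate(cases, group_size):
    """Fraction of a group in which the effect occurs."""
    return cases / group_size

# Hypothetical counts, invented for illustration only.
smokers, smokers_with_disease = 1000, 150
nonsmokers, nonsmokers_with_disease = 1000, 50

rate_exposed = rate(smokers_with_disease, smokers)
rate_unexposed = rate(nonsmokers_with_disease, nonsmokers)

# Not sufficient: most people with the factor never show the effect.
not_sufficient = rate_exposed < 1.0
# Not necessary: some people without the factor show the effect anyway.
not_necessary = rate_unexposed > 0.0
# Contributing: the effect is more likely when the factor is present.
raises_likelihood = rate_exposed > rate_unexposed

relative_risk = rate_exposed / rate_unexposed
print(f"rate with factor: {rate_exposed:.2f}, without: {rate_unexposed:.2f}")
print(f"relative risk: {relative_risk:.1f}")
```

No single person in either group shows that the factor matters; only the difference between the two rates does.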

The existence of a correlation, however, does not prove causality--not by itself. A correlation may occur by chance, or it may reflect a causal relationship quite different from the one it suggests.
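The sketch below shows one way a correlation can mislead: a hidden third variable that is associated with both the suspected factor and the effect. The counts are constructed by hand (they are not real data) so that the factor makes no difference at all within either level of the hidden variable, yet the overall figures still show a strong correlation.

```python
# Each row: (hidden_variable_level, has_factor, group_size, number_with_effect)
# Hypothetical counts, built so the factor has no effect within either level.
strata = [
    (0, True, 100, 10),
    (0, False, 300, 30),
    (1, True, 300, 150),
    (1, False, 100, 50),
]

def effect_rate(rows):
    """Share of people in the given rows who show the effect."""
    total = sum(size for _, _, size, _ in rows)
    with_effect = sum(n for _, _, _, n in rows)
    return with_effect / total

# Ignoring the hidden variable, the factor appears to double the effect rate.
overall_with = effect_rate([r for r in strata if r[1]])
overall_without = effect_rate([r for r in strata if not r[1]])

# Holding the hidden variable fixed, the apparent difference disappears.
within = {}
for level in (0, 1):
    with_factor = effect_rate([r for r in strata if r[0] == level and r[1]])
    without_factor = effect_rate([r for r in strata if r[0] == level and not r[1]])
    within[level] = (with_factor, without_factor)

print(f"overall: {overall_with:.2f} with the factor vs {overall_without:.2f} without")
for level, (w, wo) in within.items():
    print(f"hidden variable = {level}: {w:.2f} with vs {wo:.2f} without")
```

Here the overall correlation arises only because people with the factor are more likely to be at the high level of the hidden variable, and that level is what goes with the effect.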

The rules for evaluating statistical evidence of causality rest on the same basic principle as Mill's methods, just as drawing a statistical generalization from a sample is governed by the same basic principle as drawing a universal generalization.

Now, though, we are comparing groups rather than the individual cases we compared when we studied Mill's methods. We take two groups that are identical except that one (the experimental group) has the property we're testing, while the other (the control group) does not. The property that we're testing is called the independent variable, and the effect is called the dependent variable.

We have to use groups when a factor is only a contributing factor, for the reason explained above. We also have to make an adjustment in the way we measure the dependent variable, the effect. The question is not whether the effect occurs, or to what degree, in a particular case; we are now comparing groups, not individual cases. The question is whether a factor makes a statistical difference in the effect.
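One way to ask whether a factor makes a statistical difference is to compare the average value of the dependent variable in the two groups and then check how often a difference that large would turn up if group membership were merely shuffled. The sketch below uses an exact permutation test for this; that technique is brought in here for illustration and is not named in the text, and the scores and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(experimental, control):
    """Observed difference in means, and the share of all possible regroupings
    whose difference is at least as large (one-sided exact permutation test)."""
    observed = mean(experimental) - mean(control)
    pooled = experimental + control
    k = len(experimental)
    at_least_as_large = 0
    total = 0
    # Try every way of picking k of the pooled scores as the "experimental" group.
    for chosen in combinations(range(len(pooled)), k):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-9:
            at_least_as_large += 1
    return observed, at_least_as_large / total

# Hypothetical scores on the dependent variable for two small groups.
experimental_group = [12, 15, 14, 16, 13]
control_group = [10, 11, 9, 12, 13]

difference, p_value = permutation_p_value(experimental_group, control_group)
print(f"difference in means: {difference}, p-value: {p_value:.3f}")
```

A small p-value means that a difference this large rarely arises from shuffling alone, which speaks against chance as the explanation. It does not by itself rule out other differences between the groups.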

Finding two groups whose members are all identical is out of the question. Fortunately, that is not necessary.

Example:

Suppose you want to know whether a certain cram course can raise people's SAT scores.

Many other variables affect a person's score, so we cannot expect to find people who are alike in every respect. Because we are dealing with groups, what matters is that the group that takes the course and the group that does not have the same distribution on those other variables--the same distribution by verbal ability, memory, and so forth. In that case, the experimental and control groups are statistically identical except for the variable we are testing, and a statistical difference in the effect can then be attributed to that variable.
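The following sketch illustrates what "the same distribution" means for a single variable. The people, their ability bands, and the helper name `distribution` are all hypothetical. The two groups contain different individuals and are even of different sizes, yet each ability band makes up the same share of both groups.

```python
from collections import Counter

def distribution(group, variable):
    """Share of the group at each level of the given variable."""
    counts = Counter(person[variable] for person in group)
    total = len(group)
    return {level: n / total for level, n in counts.items()}

# Hypothetical participants, described only by a verbal-ability band.
experimental_group = [
    {"name": "A", "verbal": "low"},
    {"name": "B", "verbal": "medium"},
    {"name": "C", "verbal": "medium"},
    {"name": "D", "verbal": "high"},
]
control_group = [
    {"name": "E", "verbal": "low"},
    {"name": "F", "verbal": "low"},
    {"name": "G", "verbal": "medium"},
    {"name": "H", "verbal": "medium"},
    {"name": "I", "verbal": "medium"},
    {"name": "J", "verbal": "medium"},
    {"name": "K", "verbal": "high"},
    {"name": "L", "verbal": "high"},
]

exp_dist = distribution(experimental_group, "verbal")
ctrl_dist = distribution(control_group, "verbal")
same_distribution = exp_dist == ctrl_dist
print(exp_dist, ctrl_dist, same_distribution)
```

In practice the same check would have to hold, at least approximately, for every variable that could affect the scores, not just one.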

### Comprehension Questions

1. Over 80% of violent criminals have watched a violent TV show before they committed their act of violence. Thus, violence on TV is the cause of violent crime.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

2. When a local bank offered to give \$1000 to every student with a 1000 on his/her SAT, the SAT scores increased dramatically. Thus, the bank's offer was the cause of the increase in the SAT scores.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

3. The Church of the Grand Enlightenment has had its whole congregation praying for world peace for over a month and the incidents of international conflict have decreased.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

4. Apparently women do not have the ability to be orchestra conductors since only a small percentage of orchestra conductors are women.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

5. The incidence of teenage pregnancy has steadily increased since the 1960s. So has the number of students required to take sex education classes. Thus, sex education has caused the increase in teen pregnancies.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

6. A new survey conducted from 1993-1996 using "NBC Nightly News with Tom Brokaw" found that TV news focused on crime more than ever with 5% of all news stories about crime.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

7. Americans have an increasing appetite for violence as witnessed by the steadily increasing number of television sets being sold.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

8. The Stock Market suffered its second largest decline on record due to a loss of investor confidence.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

9. Companies are pouring billions into computers and other high-tech machinery that should make workers more efficient. Such improvements occur, though, in the service industries in which productivity is harder to measure. Yet, according to the Commerce Department, productivity rose just 0.6 percent in the spring quarter.
   - a) There are confounding variables.
   - b) It is unclear which variable is the cause and which the effect.
   - c) It is unreasonable to generalize from the sample actually studied.
   - d) The variables actually measured are not good stand-ins for the

10. Fairfax County found that 35 percent of the sprinklers failed to activate under 7 pounds of pressure. However, the manufacturer maintains that the 7 psi threshold for passing or failing does not reflect typical water pressure in sprinkler systems.
    - a) There are confounding variables.
    - b) It is unclear which variable is the cause and which the effect.
    - c) It is unreasonable to generalize from the sample actually studied.
    - d) The variables actually measured are not good stand-ins for the
