Many statistical parameters have been devised to characterize measured values: the quality of data, the goodness of fits, the significance of correlations, and so on. But a few apply not to the measurements themselves, but to the frequencies with which they occur. The chi-squared statistic is one.

Chi-squared is used to help us decide the extent to which two discrete frequency distributions are the same. Often one distribution is measured and the other is theoretically expected. Then chi-squared is defined by [1, 2]

χ² = Σ_{i=1}^{M} (N_{i} – n_{i})²/n_{i}    (1)

where M is the number of possible outcomes for one event, N_{i} is the number of times we observe the particular outcome i, and n_{i} is the number of times we expect outcome i to be observed.

If the event is to flip a coin, then there are two possible outcomes, so M = 2. However, if the event is to throw a die, then M = 6. We obtain values for the N_{i} in (1) by initiating a total of K events and counting the number of times each outcome i is observed. Values for the expected numbers n_{i} are usually obtained by estimating the probability p_{i} for observing outcome i and computing the expectation value from

n_{i} = p_{i} K

(2)

and then (1) can be written as

χ² = Σ_{i=1}^{M} (N_{i} – p_{i} K)²/(p_{i} K)    (3)

In most situations the M possible outcomes are mutually exclusive and exhaustive, so the probabilities must sum to unity,

Σ_{i=1}^{M} p_{i} = 1    (4)

and

Σ_{i=1}^{M} N_{i} = K    (5)

so (3) reduces to

χ² = (Σ_{i=1}^{M} N_{i}²/(p_{i} K)) – K    (6)

When χ² = 0, the observed distribution coincides with the expected one; otherwise χ² > 0, and it is not bounded from above. Usually we want χ² to be small. Chi-squared is an appropriate measure under these conditions [2]:

- The measured and expected distributions are discrete, not continuous.
- Each event is independent of all other events.
- The number of observed events is large; this is commonly interpreted to mean that the smallest number of observed events satisfies N_{i} > 5.
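Before turning to an example, it may help to see (3) and its reduced form (6) spelled out in code. The following Python sketch is our own illustration, not code from [1] or [2]; the die counts in the example are made-up numbers chosen only to exercise the functions.

```python
def chi_squared(observed, probs):
    """Chi-squared per Eq. (3): sum of (N_i - p_i K)^2 / (p_i K),
    where K is the total number of events."""
    K = sum(observed)
    return sum((N - p * K) ** 2 / (p * K) for N, p in zip(observed, probs))

def chi_squared_reduced(observed, probs):
    """The reduced form, Eq. (6): sum of N_i^2 / (p_i K), minus K.
    Valid when the outcomes are mutually exclusive and exhaustive."""
    K = sum(observed)
    return sum(N ** 2 / (p * K) for N, p in zip(observed, probs)) - K

# Example: 60 throws of a supposedly fair die (p_i = 1/6 for each face),
# with hypothetical observed counts for the six faces.
counts = [12, 9, 11, 10, 8, 10]
fair_die = [1 / 6] * 6
print(chi_squared(counts, fair_die), chi_squared_reduced(counts, fair_die))
```

The two functions return the same value (up to floating-point roundoff), which is just the algebraic reduction of (3) to (6) under constraints (4) and (5).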

We have two coins; one is a fair coin, so flipping it gives heads or tails with equal probability (p = 0.5). But the other coin is weighted so that flipping it gives heads with probability p = 0.8. We cannot visually distinguish between the two, but we need to identify the biased coin. To do so, we pick one coin, flip it fifty times, and record the number of heads and tails we observe. With these data we compute χ² twice: once assuming the coin is fair, a second time assuming it is biased. We use the smaller value of χ² to decide whether the coin is fair or biased.

For each coin, there are only two possible outcomes: either a head or a tail. So the sum in (3) has only two terms. For the fair coin with p = 0.5, (3) simplifies to

χ² = 2(N_{h} – 25)²/25    (7)

where N_{h} is the number of heads we observe in fifty flips of the coin. For the biased coin with p = 0.8, (3) simplifies to

χ² = (N_{h} – 40)²/8    (8)

Table 1 gives some possible results for fifty flips of one coin. The table indicates that if we flip one of the coins 50 times and get more than 33 heads, then that coin is probably the biased one; if we get 33 or fewer heads, then that coin is probably the fair one. We do not consider the possibilities N_{h} < 5 or N_{h} > 45 (hence fewer than 5 tails), because those situations would violate the third condition listed after (6).

Table 1. Use of chi-squared to decide which of two coins is biased. We flip one coin 50 times and base our conclusion on the number of heads N_{h}.

N_{h} | χ² (p = 0.5), Eq. (7) | χ² (p = 0.8), Eq. (8) | Conclusion
---|---|---|---
45 | 32 | 3.1 | biased coin
40 | 18 | 0 | biased coin
35 | 8 | 3.1 | biased coin
33 | 5.1 | 6.1 | fair coin
30 | 2 | 12.5 | fair coin
25 | 0 | 28 | fair coin
20 | 2 | 50 | fair coin
15 | 8 | 78 | fair coin
10 | 18 | 112 | fair coin
5 | 32 | 153 | fair coin

If we generalize the above analysis so that K is any number of flips of one coin, then we find that whenever

N_{h} > 2K/3    (9)

then the coin is probably biased. (We caution that (9) applies only when one coin is fair and the other has an 80% probability of showing heads.)
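The threshold (9) is easy to check numerically. The sketch below is our own illustration: it evaluates (3) for a two-outcome event under each hypothesis (p = 0.5 and p = 0.8) and finds the smallest N_{h} at which the biased hypothesis gives the smaller chi-squared. For K = 50 that boundary is 34 heads, consistent with 2K/3 ≈ 33.3.

```python
def chi2_two_outcomes(N_h, K, p):
    """Eq. (3) with M = 2: heads with probability p, tails with 1 - p."""
    N_t = K - N_h
    return ((N_h - p * K) ** 2 / (p * K)
            + (N_t - (1 - p) * K) ** 2 / ((1 - p) * K))

# Smallest number of heads for which "biased" (p = 0.8) beats "fair" (p = 0.5).
K = 50
boundary = min(n for n in range(K + 1)
               if chi2_two_outcomes(n, K, 0.8) < chi2_two_outcomes(n, K, 0.5))
print(boundary)  # 34
```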

The results in Table 1 do not guarantee that we would correctly identify the coin; all we can say is that our conclusion is more likely than the alternative, and the larger the difference between the two chi-squared values, the more likely our conclusion is to be correct. However, we can quantify how much "more likely" our conclusion is relative to the alternative. This can be done because the distribution of the chi-squared statistic itself is known when the number of events studied is very large.

The χ² distribution depends on the number of "degrees of freedom" ν available in the problem. The value of ν is the number of possible outcomes M minus the number of constraints imposed; often the only constraint is that the probabilities for all outcomes must sum to unity, as in (4). Then [2]

ν = M – 1    (10)

But in some situations additional constraints apply; for example, we may know the value for the mean of the sampled distribution. In any case, ν decreases by unity for each constraint.

It is conventional to characterize the χ² distribution in terms of its percentiles for particular numbers of degrees of freedom. A sample is given in Table 2. The body of the table contains values of χ² for particular values of ν and particular percentiles p. The percentile p represents the fractional area under the distribution up to a certain value of χ², as in Figure 1. For example, with ν = 1 the 20th percentile (p = 0.2) falls at χ² = 0.064; this means that in 20 experiments out of 100 we expect the observed value of χ² to be less than 0.064, while in the other 80 it should be greater.

Table 2. Selected percentiles p of the χ² distribution for several numbers of degrees of freedom ν.

p | ν = 1 | ν = 2 | ν = 3 | ν = 4 | ν = 6 | ν = 8 | ν = 10
---|---|---|---|---|---|---|---
0.01 | 0.000 | 0.020 | 0.115 | 0.297 | 0.872 | 1.65 | 2.56
0.05 | 0.004 | 0.103 | 0.352 | 0.711 | 1.64 | 2.73 | 3.94
0.1 | 0.016 | 0.211 | 0.584 | 1.06 | 2.20 | 3.49 | 4.87
0.2 | 0.064 | 0.446 | 1.00 | 1.65 | 3.07 | 4.59 | 6.18
0.3 | 0.148 | 0.713 | 1.42 | 2.20 | 3.83 | 5.53 | 7.27
0.5 | 0.455 | 1.39 | 2.37 | 3.36 | 5.35 | 7.34 | 9.34
0.7 | 1.07 | 2.41 | 3.66 | 4.88 | 7.23 | 9.52 | 11.8
0.8 | 1.64 | 3.22 | 4.64 | 5.99 | 8.56 | 11.0 | 13.4
0.9 | 2.71 | 4.61 | 6.25 | 7.78 | 10.6 | 13.4 | 16.0
0.95 | 3.84 | 5.99 | 7.81 | 9.49 | 12.6 | 15.5 | 18.3
0.99 | 6.63 | 9.21 | 11.3 | 13.3 | 16.8 | 20.1 | 23.2
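For ν = 1 and ν = 2 the cumulative χ² distribution has simple closed forms (erf(√(x/2)) and 1 – e^{–x/2}, respectively), so those two columns of Table 2 can be spot-checked with nothing beyond the Python standard library. The following sketch is our own illustration.

```python
import math

def chi2_cdf(x, nu):
    """Cumulative chi-squared distribution: the percentile p reached
    at a given chi-squared value x, for nu = 1 or nu = 2 only."""
    if nu == 1:
        return math.erf(math.sqrt(x / 2.0))
    if nu == 2:
        return 1.0 - math.exp(-x / 2.0)
    raise ValueError("closed forms implemented only for nu = 1 and nu = 2")

# Spot-check Table 2: chi-squared = 3.84 at nu = 1 and chi-squared = 5.99
# at nu = 2 should both sit near the 95th percentile (p = 0.95).
print(chi2_cdf(3.84, 1), chi2_cdf(5.99, 2))
```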

Let us apply the χ² distribution in Table 2 to our problem of identifying which of two coins is biased. Recall we pick one coin at random and flip it 50 times. Say we obtain heads 35 times. Then from Table 1 we conclude that this coin is the biased one. Now we ask, how confident should we be that this conclusion is correct?

In this situation we have only two possible outcomes to flipping the coin, and only one constraint (4), so (10) gives the number of degrees of freedom as ν = 1. If the flipped coin were the fair one, then for N_{h} = 35, Table 1 gives χ² = 8. From the χ² distribution in Table 2 with ν = 1, we see that the 99^{th} percentile has χ² = 6.6. That is, the probability is only 1% that we would observe χ² > 6.6 (whereas we actually obtained χ² = 8).

But if the tested coin were the biased one, then Table 1 gives χ² = 3.12, and from Table 2 with ν = 1, we see that the 90^{th} percentile has χ² = 2.71. That is, the probability is 10% that we would find χ² > 2.71 (whereas we actually obtained χ² = 3.12). Comparing 10% with 1% gives us some confidence that if a coin gives heads 35 times out of 50, then that coin is the biased one.
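The two tail probabilities just quoted can also be computed directly rather than read from Table 2, since for ν = 1 the probability of exceeding a given χ² is 1 – erf(√(χ²/2)). The sketch below is our own check of the roughly-1% and roughly-10% figures for N_{h} = 35.

```python
import math

def chi2_tail_nu1(x):
    """P(chi-squared > x) for nu = 1 degree of freedom."""
    return 1.0 - math.erf(math.sqrt(x / 2.0))

# N_h = 35 heads in 50 flips:
p_if_fair = chi2_tail_nu1(8.0)      # chi-squared = 8 under the fair hypothesis; ~0.005
p_if_biased = chi2_tail_nu1(3.125)  # chi-squared = 3.125 under the biased one; ~0.08
print(p_if_fair, p_if_biased)
```

The fair-coin hypothesis is roughly fifteen times less compatible with the observation than the biased-coin hypothesis, which is the quantitative content of "comparing 10% with 1%."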

We caution that when the number of sampled events K is not very large, the chi-squared distribution in Table 2 is only a rough approximation. In fact, Knuth [2] shows that often the values in Table 2 are reliable to only one significant figure.

[1] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, *Numerical Recipes*, Cambridge University Press, New York, 1986, p. 470f.

[2] D. E. Knuth, *The Art of Computer Programming*, vol. 2, "Seminumerical Algorithms", 2nd ed., Addison-Wesley, Reading, MA, 1981, p. 39f.

[3] B. W. Lindgren and G. W. McElrath, *Introduction to Probability and Statistics*, Macmillan, New York, 1959, p. 256.