Chi-Square Tests for Homogeneity and Independence

Plain M&M's candies come in six colors: orange, yellow, brown, green, blue and red. So do peanut M&M's. But do both types of candies share the same distribution of those colors?

If we want to investigate whether the color distribution of peanut M&M's is the same as the color distribution of plain M&M's, our hypotheses would be:

H0: The colors of plain M&M candies and the colors of peanut M&M candies have the same distribution.

HA: The colors of plain M&M candies and the colors of peanut M&M candies do not have the same distribution.

We call this a test for homogeneity, because our null hypothesis is that the two populations are homogeneous (meaning "the same").

I purchased a king-size package of plain M&M's and counted the number of each color of candy: of 102 candies in the package, 11 were blue, 25 orange, 26 green, 8 yellow, 17 brown and 15 red. I also purchased a king-size bag of peanut M&M's and counted the number of each color: of 41 candies, 7 were orange, 3 yellow, 2 brown, 8 green, 16 blue and 5 red.

We could display this sample data (the observed counts for each color in each group) in a two-way table:

          plain   peanut
blue         11       16
orange       25        7
green        26        8
yellow        8        3
brown        17        2
red          15        5

If the null hypothesis is true, then the color distributions would be the same if we were to look at all plain and peanut M&M's, even though we observe some variation between the samples.

Overall, we observed 11+16 = 27 blue candies out of 102+41 = 143 total candies, so `27/143 approx 0.1888`, or 18.88%, of the candies in our sample were blue. If the null hypothesis is true, we would then expect 18.88% of the 102 plain M&M's, or 102(27/143) ≈ 19.259 of them, to be blue; likewise, we would expect 18.88% of the 41 peanut candies, or 41(27/143) ≈ 7.741 candies, to be blue. Continuing like this, we could create a table of the expected counts for each color and each type:

          plain   peanut
blue     19.259    7.741
orange   22.825    9.175
green    24.252    9.748
yellow    7.846    3.154
brown    13.552    5.448
red      14.266    5.734
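The expected counts above can also be computed in a few lines of code. The following is a minimal sketch in plain Python, not part of the calculator workflow; the names `observed` and `expected_counts` are my own. It uses the fact that "pooled proportion times group size" is the same as (row total × column total) / grand total.

```python
# Observed counts from the two-way table: rows are colors, columns are (plain, peanut).
colors = ["blue", "orange", "green", "yellow", "brown", "red"]
observed = [
    [11, 16],
    [25, 7],
    [26, 8],
    [8, 3],
    [17, 2],
    [15, 5],
]

def expected_counts(table):
    """Expected count for each cell = (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for color, row in zip(colors, expected):
    # One line per color, rounded to three decimals like the table in the text.
    print(f"{color:<8}" + "".join(f"{value:>9.3f}" for value in row))
```

For the blue row, this is the same arithmetic as 102(27/143) and 41(27/143).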

Notice that none of these numbers are integers, but because these numbers are expected counts (in a mathematical model) rather than observed counts (in real life), that's OK. Notice also that the row totals and the column totals are the same as for the table of observed counts. Because of this fact, once we know five of the entries (for example, the expected counts for five of the six colors in one column), we can compute the other entries in the table from the row and column totals. We say that this table has 5 degrees of freedom (generally computed as (r-1)(c-1), where r is the number of rows and c is the number of columns).
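To see the degrees-of-freedom idea in action, here is a small Python sketch that rebuilds the whole table of expected counts from only five entries plus the totals. Which five cells count as "known" is my own assumption for illustration (the first five plain counts); any five cells that leave one color undetermined would work the same way.

```python
# Totals taken from the observed table (the expected table shares them).
row_totals = {"blue": 27, "orange": 32, "green": 34, "yellow": 11, "brown": 19, "red": 20}
plain_total, peanut_total = 102, 41

# Assumption for illustration: these five expected counts are the "known" cells.
known_plain = {"blue": 19.259, "orange": 22.825, "green": 24.252,
               "yellow": 7.846, "brown": 13.552}

table = {}
for color, plain in known_plain.items():
    # Each row total pins down the peanut entry in that row.
    table[color] = (plain, row_totals[color] - plain)

# The plain column total pins down the one remaining plain entry (red) ...
red_plain = plain_total - sum(known_plain.values())
# ... and the red row total then pins down the last peanut entry.
table["red"] = (red_plain, row_totals["red"] - red_plain)

rows, cols = len(row_totals), 2
degrees_of_freedom = (rows - 1) * (cols - 1)

for color, (plain, peanut) in table.items():
    print(f"{color:<8}{plain:>9.3f}{peanut:>9.3f}")
print("degrees of freedom:", degrees_of_freedom)
```

Five free choices were enough to determine all twelve cells, which matches (6-1)(2-1) = 5.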

We now want to investigate the deviations between what we observed and what we expected, so we can compute the differences for each cell in these tables by subtracting the expected counts from the observed counts:

          plain   peanut
blue     -8.259    8.259
orange    2.175   -2.175
green     1.748   -1.748
yellow    0.154   -0.154
brown     3.448   -3.448
red       0.734   -0.734

We could add up (or average) these deviations, but we would end up with a sum (or average) of 0, no matter what the observed counts were. So instead we might think to square the deviations:

          plain   peanut
blue     68.207   68.207
orange    4.730    4.730
green     3.056    3.056
yellow    0.024    0.024
brown    11.886   11.886
red       0.539    0.539

This has the effect of making all of the entries in the table positive, and it also makes large deviations appear even larger (and smaller ones appear smaller). It also has the effect of making the units "candies squared" instead of "candies," which we can fix by dividing each of these squared deviations by the corresponding expected count:

          plain   peanut
blue      3.542    8.811
orange    0.207    0.516
green     0.126    0.314
yellow    0.003    0.008
brown     0.877    2.182
red       0.038    0.094
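The three steps above (subtract, square, divide by the expected count) can be sketched in plain Python as follows. This is only an illustration of the arithmetic; the variable names are my own.

```python
colors = ["blue", "orange", "green", "yellow", "brown", "red"]
observed = [[11, 16], [25, 7], [26, 8], [8, 3], [17, 2], [15, 5]]

# Expected counts: (row total * column total) / grand total.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

contributions = []
for obs_row, exp_row in zip(observed, expected):
    row = []
    for obs, exp in zip(obs_row, exp_row):
        deviation = obs - exp             # observed minus expected
        row.append(deviation ** 2 / exp)  # squared deviation, rescaled by the expected count
    contributions.append(row)

for color, row in zip(colors, contributions):
    print(f"{color:<8}" + "".join(f"{value:>9.3f}" for value in row))
```

Dividing by the expected count is what makes the same squared deviation count for more in the peanut column, where fewer candies were expected.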

If we add up all of the entries in this last table, we get a sum of 16.716. We call this value `chi^2` (or "chi-squared," the square of the Greek letter "chi"). The values of `chi^2` we get from computations like this follow a distribution that is unimodal and positively skewed. We can use this distribution, with the 5 degrees of freedom found above, to compute a P-value:

χ2cdf(16.716,1E99,5) ≈ 0.005

You can find χ2cdf in the DISTR menu on the TI-84.
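If you do not have a calculator handy, the same tail area can be approximated in Python. The sketch below uses only the standard `math` module; the function name `chi_square_upper_tail` is my own, and it assumes the degrees of freedom are a positive whole number. It plays the role of χ2cdf with an upper bound of 1E99, which stands in for infinity. (If SciPy happens to be installed, `scipy.stats.chi2.sf(16.716, 5)` should give the same area.)

```python
import math

def chi_square_upper_tail(x, df):
    """P(X >= x) for a chi-square variable X with integer df >= 1.

    Starts from the closed-form tail for df = 1 (odd df) or df = 2 (even df)
    and adds one term for every extra 2 degrees of freedom.
    """
    half_x = x / 2
    if df % 2 == 1:
        a = 0.5
        tail = math.erfc(math.sqrt(half_x))  # exact tail area when df = 1
    else:
        a = 1.0
        tail = math.exp(-half_x)             # exact tail area when df = 2
    while a < df / 2:
        tail += half_x ** a * math.exp(-half_x) / math.gamma(a + 1)
        a += 1
    return tail

p_value = chi_square_upper_tail(16.716, 5)
print(round(p_value, 3))
```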

Because the P-value is quite small, we reject the null hypothesis and conclude that the distribution of colors for plain M&M's is different from the distribution of colors for peanut M&M's.

This hypothesis test has a couple of conditions.

Randomness: We want the trials to be randomly selected from the population(s). In this example, we have cluster samples, but it may be reasonable to assume that the candies in these two bags are representative of all plain and peanut M&M's, respectively.

Expected Cells: We want the expected counts to be at least 5 (we can tolerate up to 20% of the counts being less than 5). In this example, we have one count out of 12 that is below 5, so we should be OK.
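The Expected Cells check is easy to automate. Here is a minimal Python sketch, assuming the rule of thumb exactly as stated above (expected counts of at least 5, with up to 20% of cells allowed to fall below); the variable names are my own.

```python
observed = [[11, 16], [25, 7], [26, 8], [8, 3], [17, 2], [15, 5]]

# Expected counts: (row total * column total) / grand total.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

cells = [value for row in expected for value in row]
small = [value for value in cells if value < 5]
share_small = len(small) / len(cells)

# Rule of thumb from the text: tolerate up to 20% of expected counts below 5.
condition_ok = share_small <= 0.20
print(len(small), len(cells), condition_ok)  # prints: 1 12 True
```

The one small cell is the expected count of yellow peanut candies, about 3.154.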

The calculator offers a much easier way to compute the P-value. Our data is displayed in a two-way table, so we will enter it into the calculator as a matrix. If you have a standard TI-83, press the MATRX button; if you have a TI-83 Plus or TI-84, press 2ND and x⁻¹ to access the MATRIX menu. Move the cursor over to EDIT.

Now press ENTER. Type 6 and ENTER (since there are 6 rows in our two-way table) then 2 and ENTER (since there are 2 columns). Now enter the data by typing each count (11, 16, 25, 7, etc.) and pressing ENTER after each new number.

Now press QUIT, then STAT; move the cursor over to TESTS and move down to χ2–Test.

Press ENTER. The default setting for the Observed matrix should be [A] and for the Expected matrix it should be [B]; leave these alone and move the cursor down to Calculate or Draw.

Press ENTER. The calculator should display the value of χ2 and the P-value, along with either a graph of the χ2 model (if you chose Draw) or the number of degrees of freedom (if you chose Calculate): (6-1)(2-1) = 5.

Before we continue with the hypothesis test, however, we need to check the expected counts; the calculator has stored these counts in matrix [B]. Go back to the MATRIX menu, move the cursor over to EDIT, then move the cursor down to [B].

Press ENTER and you will see the expected counts.
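For readers working on a computer rather than a calculator, the steps that χ2–Test carries out can be gathered into one function. This is a sketch under my own naming (`chi_square_test`, `matrix_a`, `matrix_b`), using only Python's standard library and assuming every expected count is positive and the table has at least two rows and two columns. It is not a description of how the calculator works internally.

```python
import math

def chi_square_test(observed):
    """Return (chi_square, df, p_value, expected) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected counts under the null hypothesis (the calculator stores these in [B]).
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell.
    chi_square = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(observed, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)

    # Upper-tail area of the chi-square model, built up two degrees of freedom at a time.
    half = chi_square / 2
    if df % 2 == 1:
        a, p_value = 0.5, math.erfc(math.sqrt(half))
    else:
        a, p_value = 1.0, math.exp(-half)
    while a < df / 2:
        p_value += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return chi_square, df, p_value, expected

# Matrix [A] from the M&M's example: rows are colors, columns are (plain, peanut).
matrix_a = [[11, 16], [25, 7], [26, 8], [8, 3], [17, 2], [15, 5]]
chi_square, df, p_value, matrix_b = chi_square_test(matrix_a)
print(f"chi-square = {chi_square:.3f}, df = {df}, P-value = {p_value:.3f}")
```

As on the calculator, it is worth looking at the returned expected counts (`matrix_b` here) before trusting the P-value. If SciPy is available, `scipy.stats.chi2_contingency` performs a comparable computation.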

Independence
In the M&M's example above, we were comparing two separate groups (plain and peanut M&M's) and we gathered sample data from each group separately.

We can employ exactly the same mathematical techniques when we examine a potential relationship between two categorical variables. Earlier in this course, we investigated possible associations between categorical variables using mosaic plots and conditional probabilities. Here we use a hypothesis test.

The calculator method for the test for independence follows exactly the same steps as the test for homogeneity. The difference is that instead of having two (or more) samples compared via a single variable, a test for independence involves one sample and two variables.

Remember the Coke vs. Pepsi data discussed at length earlier during the course? Here's the data to refresh your memory:

          female     male
Coke           7        9
Pepsi         10        4
neither        8        3

We have one sample (a group of statistics students) and two variables (beverage preference and gender), so a test for independence is appropriate here. While these students are not randomly selected, they may at least be representative of all statistics students at EdCC. We'll wait to check the Expected Cells condition until we perform the mechanics of the test.

Our hypotheses are:

H0: Gender and beverage preference are independent.

HA: Gender and beverage preference are not independent.

Enter the Coke vs. Pepsi data into your calculator (following the steps listed above for the homogeneity test) and compute the P-value (and be sure to check that all the expected counts are at least 5).

Exercises

1. Refer to the Coke vs. Pepsi example above.

a) How many degrees of freedom are there in this test?

b) What's the smallest expected count?

c) Compute the P-value.

d) Look back at the discussion of mosaic plots: do you reach the same conclusion with the test for independence that we did earlier just looking at the mosaic plot?

2. [OIS 1.48] Views on immigration. A SurveyUSA poll conducted January 27–29, 2012, interviewed 910 registered voters from Tampa, Florida, asking each respondent if they thought workers who have illegally entered the US should be (i) allowed to keep their jobs and apply for US citizenship, (ii) allowed to keep their jobs as temporary guest workers but not allowed to apply for US citizenship, or (iii) lose their jobs and be required to leave the country. The survey also asked each respondent to characterize their political ideology (conservative, moderate, liberal). The results of this survey appear in the table below:

                                         political ideology
                                         Conservative  Moderate  Liberal
             (i) Apply for citizenship             57       120      101
immigration  (ii) Guest worker                    121       113       28
response     (iii) Leave the country              179       126       45
             (iv) Not sure                         15         4        1

a) Which is appropriate here: a test for homogeneity or a test for independence?

b) How many degrees of freedom are there in that test?

c) State the hypotheses for the test. 

d) Are the conditions for the test satisfied? Explain.

e) Compute the P-value for the test.

f) State an appropriate conclusion.