The Chi-square test for independence (a.k.a. χ-square test or Pearson’s chi-square test of association) comes in handy when you need to check whether two categorical variables are associated and the dataset is made of counts. Often this dataset will look like a “contingency table”, something like this:

          Food A      Food B      Food C
male      count 1A    count 1B    count 1C
female    count 2A    count 2B    count 2C

Of course, the nature of these variables will vary. Sometimes each variable has only 2 categories (levels) and your dataset is limited to a 2×2 table; sometimes one or both of the variables have many more categories, and the complexity of your dataset increases accordingly.

Regardless of the number of rows, columns and cells, the goal of the test is often one of these two:

Note: the result of the Chi-square test might be unreliable if the sample is small, more precisely if the expected counts in some cells are low (below 5 by the most common rule of thumb; some say below 10). One may thus use Fisher’s exact test instead in such cases. In any case, Fisher’s exact test is valid for all sample sizes.

Let’s take an example. We test 3 different types of food (A, B and C) on male and female dogs and note the preference of each individual. We want to know whether there is a food preference that depends on gender. Let’s look at the data:

##        Food A Food B Food C
## male       45     78     11
## female     63     79      8
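
The code that built this table is not shown. As a minimal sketch, one way to enter the same counts in R is as a matrix with named rows and columns. The object name `experiment` is an assumption taken from the `data:` line of the test output; the layout of the `matrix()` call is just one possible choice.

```r
# Assumption: the table is stored in an object called `experiment`,
# as suggested by the "data: experiment" line in the test output.
experiment <- matrix(
  c(45, 78, 11,   # male:   Food A, Food B, Food C
    63, 79,  8),  # female: Food A, Food B, Food C
  nrow = 2, byrow = TRUE,
  dimnames = list(c("male", "female"),
                  c("Food A", "Food B", "Food C"))
)

# Print the contingency table
experiment
```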

Let’s proceed with the Chi-square test, where the null hypothesis H0 is that food preference is independent of gender. The function is chisq.test():

##  Pearson's Chi-squared test
## data:  experiment
## X-squared = 2.5869, df = 2, p-value = 0.2743
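
The call behind this output is presumably a plain `chisq.test()` on the table. The sketch below rebuilds the counts so that it runs on its own; the names `experiment` and `result` are assumptions made for illustration.

```r
# Rebuild the table of counts (object name `experiment` is an assumption)
experiment <- matrix(
  c(45, 78, 11,
    63, 79,  8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("male", "female"),
                  c("Food A", "Food B", "Food C"))
)

# Pearson's Chi-square test of independence
result <- chisq.test(experiment)
result

# Individual pieces of the result can be extracted if needed
result$statistic  # the X-squared value
result$parameter  # degrees of freedom
result$p.value    # p-value
```

Storing the result in an object makes it easy to reuse the statistic or the p-value later, for instance in a report.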

The obtained p-value is above 0.05. We therefore fail to reject the null hypothesis H0: the data provide no evidence of a gender-dependent food preference.
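
To see where the statistic comes from, and to check the expected counts mentioned in the note on small samples, the calculation can be sketched by hand. This is an illustration under the usual textbook definition (expected count = row total × column total / grand total), not necessarily how the original analysis was run; the variable names are assumptions.

```r
# Rebuild the table of counts (object name `experiment` is an assumption)
experiment <- matrix(
  c(45, 78, 11,
    63, 79,  8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("male", "female"),
                  c("Food A", "Food B", "Food C"))
)

# Expected counts under independence:
# (row total * column total) / grand total
expected <- outer(rowSums(experiment), colSums(experiment)) / sum(experiment)
expected

# Pearson's statistic: sum over all cells of (observed - expected)^2 / expected
x_squared <- sum((experiment - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(experiment) - 1) * (ncol(experiment) - 1)

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

x_squared
df
p_value

# Smallest expected count, to compare against the rule of thumb
min(expected)
```

Here the smallest expected count is close to 9, so the usual "below 5" rule of thumb is not triggered for this table.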

Should you have a preference for Fisher’s exact test, the function is fisher.test():

##  Fisher's Exact Test for Count Data
## data:  experiment
## p-value = 0.2622
## alternative hypothesis: two.sided

In the present case, the conclusion is the same as for the Chi-square test: there is no evidence of a gender-dependent food preference.
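
The Fisher output above presumably comes from calling `fisher.test()` on the same table. A minimal, self-contained sketch follows; the names `experiment` and `fisher_result` are assumptions.

```r
# Rebuild the table of counts (object name `experiment` is an assumption)
experiment <- matrix(
  c(45, 78, 11,
    63, 79,  8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("male", "female"),
                  c("Food A", "Food B", "Food C"))
)

# Fisher's exact test; fisher.test() also accepts tables larger than 2x2
fisher_result <- fisher.test(experiment)
fisher_result

# The p-value alone
fisher_result$p.value
```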