What does this test do?


Running the test

We use this example from Wikipedia, where the handedness of men and women was compared:

48 women and 52 men were asked whether they are left- or right-handed.
Is the proportion of left-handed persons significantly larger for one of the sexes?
This is equivalent to asking whether handedness is independent of sex. Therefore, this can be seen as a test of independence.

The numbers have to be arranged in a contingency table:

x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)
colnames(x) <- c("Right-handed", "Left-handed")
rownames(x) <- c("Male", "Female")
x
##        Right-handed Left-handed
## Male             43           9
## Female           44           4

We have 9 out of 52 left-handed individuals among men, and 4 out of 48 left-handed individuals among women, i.e. the proportions are roughly 17.3% and 8.3%.
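The proportions can also be computed directly from the table. The following is a minimal sketch using prop.table from base R; the variable name props is only chosen for illustration.

```r
# Same contingency table as above, rebuilt so the snippet is self-contained
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE,
            dimnames = list(c("Male", "Female"),
                            c("Right-handed", "Left-handed")))

# margin = 1 divides each cell by its row total,
# so every row sums to 1
props <- prop.table(x, margin = 1)
round(props, 3)
```

The column "Left-handed" then holds 9/52 for men and 4/48 for women.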

The proportions clearly differ, but keep in mind that we cannot tell by eye whether they are significantly different. With samples this small, a difference of this size can easily arise by chance, so a significance test is needed.

The test is conducted using the function fisher.test of the R stats-package:

fisher.test(x)
## 
##  Fisher's Exact Test for Count Data
## 
## data:  x
## p-value = 0.2392
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.09150811 1.71527769
## sample estimates:
## odds ratio 
##  0.4378606


Reading the output

The null hypothesis is that the proportions are the same, which is equivalent to the statement that the variables sex and handedness are independent.

The p-value of 0.2392 in the output above is well above the usual significance level of 0.05, so the null hypothesis is not rejected, i.e. we cannot state that there is a significant difference in handedness between women and men.
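To see where the p-value comes from, it can be recomputed by hand. With all row and column totals held fixed, the top-left cell follows a hypergeometric distribution under the null hypothesis, and the two-sided p-value sums the probabilities of all tables that are no more likely than the observed one. The sketch below uses dhyper from the stats package; the small relative tolerance of 1e-7 is an assumption about how ties are handled and may not match the internals of fisher.test exactly.

```r
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)

m <- sum(x[, 1])   # total right-handed
n <- sum(x[, 2])   # total left-handed
k <- sum(x[1, ])   # total men

# All values the top-left cell can take when the margins are fixed
support <- max(0, k - n):min(k, m)
probs   <- dhyper(support, m, n, k)

# Probability of the observed table
p_obs <- dhyper(x[1, 1], m, n, k)

# Two-sided p-value: sum over tables at most as likely as the observed one
# (tolerance 1e-7 is an assumption, see text)
p_manual <- sum(probs[probs <= p_obs * (1 + 1e-7)])
p_manual
```

The result can be compared with fisher.test(x)$p.value.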

The confidence interval refers to the odds ratio. It is constructed so that, in repeated sampling, 95% of such intervals would contain the true odds ratio (the level can be changed with the parameter conf.level in the function call). We see that 1 is included in the interval, which is consistent with the conclusion drawn from the p-value.
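As a small illustration of the conf.level parameter, the following sketch requests a 99% interval instead of the default and checks whether it contains 1. The variable name res99 is hypothetical.

```r
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)

# Same test, but with a 99% confidence interval for the odds ratio
res99 <- fisher.test(x, conf.level = 0.99)
res99$conf.int

# TRUE if the interval contains an odds ratio of 1
contains_one <- res99$conf.int[1] < 1 && res99$conf.int[2] > 1
contains_one
```

A higher confidence level gives a wider interval, so an interval that contains 1 at 95% will also contain it at 99%.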


Remarks

Arrangement of the numbers in the table

In case you wonder: transposing the contingency table, i.e. swapping rows and columns, does not change the result:

y <- t(x)
y
##              Male Female
## Right-handed   43     44
## Left-handed     9      4
fisher.test(y)
## 
##  Fisher's Exact Test for Count Data
## 
## data:  y
## p-value = 0.2392
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
##  0.09150811 1.71527769
## sample estimates:
## odds ratio 
##  0.4378606
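A related check, sketched here with plain base R indexing, is to swap the two rows instead of transposing. The expectation is that the p-value stays the same while the estimated odds ratio becomes (approximately) the reciprocal, because the comparison now runs in the other direction. The names z, res_x and res_z are only used for this illustration.

```r
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE,
            dimnames = list(c("Male", "Female"),
                            c("Right-handed", "Left-handed")))

# Put the "Female" row first
z <- x[c("Female", "Male"), ]

res_x <- fisher.test(x)
res_z <- fisher.test(z)

# Same p-value for both arrangements
c(res_x$p.value, res_z$p.value)

# Odds ratio of the swapped table vs. reciprocal of the original
# (equal up to the numerical tolerance of the estimation)
c(unname(res_z$estimate), 1 / unname(res_x$estimate))
```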


Odds ratio in the output

An estimate of the odds ratio is displayed in the output above.
Quoting the help page of the function: “Note that the conditional Maximum Likelihood Estimate (MLE) rather than the unconditional MLE (the sample odds ratio) is used.” That means you cannot reproduce the reported odds ratio by simply computing the cross-product ratio from the numbers in the contingency table.
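The difference between the two estimates can be made visible with a short sketch. The sample odds ratio is the cross-product ratio of the four cells; the names sample_or and cond_mle are chosen here for illustration only.

```r
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)

# Unconditional MLE: the sample odds ratio (43 * 4) / (9 * 44) = 172 / 396
sample_or <- (x[1, 1] * x[2, 2]) / (x[1, 2] * x[2, 1])
sample_or

# Conditional MLE as reported by fisher.test
cond_mle <- unname(fisher.test(x)$estimate)
cond_mle
```

The two values are close but not identical (about 0.434 versus 0.438).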