We use this example from Wikipedia, where the handedness of men and women was compared:
48 women and 52 men were asked whether they are left- or right-handed.
Is the proportion of left-handed persons
significantly larger for one of the sexes?
This is equivalent to asking whether handedness is independent of sex.
Therefore, this can be seen as a test of independence.
The numbers have to be arranged in a contingency table:
# Rows are the sexes, columns the handedness; counts are filled row by row
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)
colnames(x) <- c("Right-handed", "Left-handed")
rownames(x) <- c("Male", "Female")
x
##        Right-handed Left-handed
## Male             43           9
## Female           44           4
We have 9 out of 52 left-handed individuals among men, and 4 out of 48 left-handed individuals among women, i.e. the proportions are 9/52 ≈ 0.173 and 4/48 ≈ 0.083.
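As a minimal sketch, the row-wise proportions can be computed with prop.table from base R. The table is rebuilt here so that the snippet runs on its own; the name props is only illustrative and does not appear in the original example.

```r
# Rebuild the contingency table from the text so the snippet is self-contained
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)
colnames(x) <- c("Right-handed", "Left-handed")
rownames(x) <- c("Male", "Female")

# Row-wise proportions: each row sums to 1 (props is an illustrative name)
props <- prop.table(x, margin = 1)

# Share of left-handed individuals per sex, i.e. 9/52 and 4/48
props[, "Left-handed"]
```

Using margin = 1 normalises within each sex, which is the comparison of interest here; margin = 2 would instead give the share of each sex within a handedness group.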
The proportions do differ, but we cannot tell by eye whether the difference is significant. A difference only counts as significant when the sample is large enough to rule out that it arose by chance alone.
The test is conducted using the function fisher.test from the R stats package:
fisher.test(x)
## Fisher's Exact Test for Count Data
## data: x
## p-value = 0.2392
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 0.09150811 1.71527769
## sample estimates:
## odds ratio
## 0.4378606
The null hypothesis is that the proportions are the same, which is equivalent to the statement that the variables sex and handedness are independent.
The high p-value in the output above (0.2392) indicates that the null hypothesis is not rejected, i.e. we cannot state that there is a significant difference in handedness between women and men.
The confidence interval refers to the odds ratio. With the default conf.level = 0.95, it is constructed so that it covers the true odds ratio in 95% of repeated samples. Since 1 lies inside the interval, it confirms the conclusion drawn from the p-value.
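The following sketch shows one way to read the p-value and the confidence interval from the result object instead of from the printed output. The significance level alpha = 0.05 and the variable names res, alpha and ci are assumptions made for illustration; they are not part of the original example.

```r
# Rebuild the contingency table so the snippet is self-contained
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)

# fisher.test returns a list-like object; defaults are two-sided, conf.level = 0.95
res <- fisher.test(x)

res$p.value    # the p-value shown in the printed output
res$conf.int   # confidence interval for the odds ratio

# Assumed significance level for the decision rule (illustrative choice)
alpha <- 0.05
res$p.value < alpha          # TRUE would mean: reject the null hypothesis

# Check whether an odds ratio of 1 lies inside the interval
ci <- res$conf.int
ci[1] < 1 && 1 < ci[2]

# A different confidence level can be requested explicitly
fisher.test(x, conf.level = 0.99)$conf.int
```

For a two-sided test, the check on the p-value and the check on the interval generally point in the same direction, which is why the text treats them as confirming each other.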
In case you wonder: transposing the contingency table, i.e. swapping rows and columns, does not change the result:
y <- t(x)
y
##              Male Female
## Right-handed   43     44
## Left-handed     9      4
fisher.test(y)
## Fisher's Exact Test for Count Data
## data: y
## p-value = 0.2392
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 0.09150811 1.71527769
## sample estimates:
## odds ratio
## 0.4378606
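Rather than comparing the two printouts by eye, the results for the original and the transposed table can be placed side by side. This is a small sketch; res_x and res_y are illustrative names.

```r
# Rebuild the table and its transpose so the snippet is self-contained
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)
y <- t(x)

res_x <- fisher.test(x)
res_y <- fisher.test(y)

# p-values of both orientations, side by side
c(p_x = res_x$p.value, p_y = res_y$p.value)

# Estimated odds ratios of both orientations, side by side
c(or_x = unname(res_x$estimate), or_y = unname(res_y$estimate))
```

Note that this only covers transposition. Swapping the two rows (or the two columns) is a different operation: it leaves the two-sided p-value unchanged but inverts the odds ratio.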
An estimate of the odds ratio is displayed in the output above.
The help page of the function states: “Note that the conditional Maximum
Likelihood Estimate (MLE) rather than the unconditional MLE (the sample
odds ratio) is used.” That means you cannot reproduce the reported
odds ratio by simply taking the cross-product ratio of the numbers in the contingency table.
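To make the difference concrete, the sketch below computes the sample odds ratio (the unconditional MLE) by hand and compares it with the estimate returned by fisher.test. The names sample_or and cond_or are illustrative.

```r
# Rebuild the contingency table so the snippet is self-contained
x <- matrix(c(43, 9, 44, 4), ncol = 2, byrow = TRUE)

# Sample odds ratio (unconditional MLE): cross-product ratio a*d / (b*c)
sample_or <- (x[1, 1] * x[2, 2]) / (x[1, 2] * x[2, 1])
sample_or   # 172 / 396, roughly 0.434

# Conditional MLE as reported by fisher.test (0.4378606 in the output above)
cond_or <- unname(fisher.test(x)$estimate)
cond_or

# The two estimates are close but not identical
cond_or - sample_or
```

The two values differ only in the third decimal place here, but they are different estimators, so the hand-computed ratio should not be expected to match the printed one.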