Explanations & examples:
A chi-squared test of independence tests whether two variables are independent of each other or dependent on each other, based on data from a sample or an observation. At the start of the test, a null hypothesis (H0) is stated, namely that the two variables involved are independent of each other. Phrased alternatively: there is no association between the two variables. Then the expected values are calculated. The expected values are the values that would have been expected in each cell of the table if the null hypothesis were in fact true. If the differences between the observed values and the expected values are too big, the p-value will be below 0.05, and H0 is rejected. The p-value is the probability of obtaining differences at least as large as those in the observed data set under the assumption that H0 is actually true. If the p-value is under 5% (0.05), then data deviating this much from the expected values would be unlikely (namely less than 5% probable) if H0 were true. Therefore H0 is rejected in this case in favour of the alternative hypothesis H1, namely that the two variables in question are dependent on each other.
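The steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the 2×2 table of counts is made up (it is not the survey data from this page), the function names `expected_counts` and `chi_square_statistic` are hypothetical, no continuity correction is applied, and the p-value shortcut via `math.erfc` holds only for one degree of freedom, i.e. for a 2×2 table.

```python
import math


def expected_counts(observed):
    """Expected counts under H0: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts, chosen only for illustration (NOT the survey data).
observed = [[30, 20],
            [10, 40]]

expected = expected_counts(observed)
chi2 = chi_square_statistic(observed)

# For a 2x2 table there is 1 degree of freedom, and the upper-tail
# probability of the chi-squared distribution is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print("expected:", expected)
print("chi2:", chi2)
print("reject H0 at the 5% level:", p_value < 0.05)
```

For larger tables the same expected counts and statistic apply, but the p-value has to come from the chi-squared distribution with (rows − 1) × (columns − 1) degrees of freedom.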
To see the formulas used in the calculations of the chi-square statistic (χ2) and the corresponding p-value, please see the page formulas.
We want to investigate whether there is a connection between a person's sex and the types of sports that person prefers. In other words: whether the two variables "sex" and "preferred sport" are independent of each other or dependent on each other. If the latter is the case, then knowing a person's sex changes the probability that the person prefers a specific type of sport, compared with a person of the opposite sex.
In a high school there are 731 students, 369 girls and 362 boys. In a survey, the students were asked what their favourite sport is. The results are shown in the following table:
We phrase the null hypothesis: The two variables "sex" and "preferred sport" are independent of each other. The null hypothesis can of course be phrased in more than one way. An alternative that states the same essence would be: There is no connection between a person's sex and a person's favourite sport.
By entering these observed data into the table, we perform the chi-squared test of independence between the two variables.
The p-value of the test is reported as p = 0.0000 (that is, zero to four decimal places), which is far below 0.05. We therefore reject the null hypothesis that the two variables "sex" and "preferred sport" are independent of each other. Instead we accept H1: that the two variables are dependent on each other, meaning that boys and girls differ in which types of sports they prefer. So a person's sex is associated with the probability of that person having a specific sport as his/her favourite in this made-up example. Note that the chi-square statistic (the number χ2) is larger than the critical value of chi-squared. This is always the case when the p-value is below 0.05; otherwise χ2 would have been smaller than the critical value. For an illustration of this, please see the page tables.
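The link between the p-value and the critical value can be checked numerically. The sketch below assumes 2 degrees of freedom (as for a hypothetical 2×3 table; the degrees of freedom of the example on this page depend on its number of sports and are not assumed here), because for 2 degrees of freedom the upper-tail probability of the chi-squared distribution has the simple closed form exp(−χ2/2). The function name `p_value_df2` and the trial values of χ2 are made up for illustration.

```python
import math

ALPHA = 0.05  # significance level used throughout this page


def p_value_df2(chi2):
    """Upper-tail probability of the chi-squared distribution, 2 degrees of freedom only."""
    return math.exp(-chi2 / 2)


# The critical value is the chi2 at which the p-value equals ALPHA exactly.
critical_value = -2 * math.log(ALPHA)  # roughly 5.99 for 2 degrees of freedom

# Hypothetical chi2 values on both sides of the critical value.
for chi2 in (2.0, 5.0, 7.0, 20.0):
    p = p_value_df2(chi2)
    # "chi2 above the critical value" and "p below ALPHA" always agree.
    assert (chi2 > critical_value) == (p < ALPHA)
    print(chi2, round(p, 4), "reject H0" if p < ALPHA else "do not reject H0")
```

The two decision rules are equivalent because the p-value decreases steadily as χ2 grows, so comparing χ2 with the critical value and comparing p with 0.05 always lead to the same conclusion.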
Contributions to \( \chi^2 \)