Chi-Square Tests
Use a chi-square test when both the independent variable and the dependent variable are categorical.
Ho: There is no relationship between x and y. Chi-square = 0
H1: There is a relationship between x and y. Chi-square ≠ 0
*Fill in what x and y are in the above hypotheses.
*A chi-square test cannot have a directional hypothesis. The chi-square distribution has only a right tail, and a chi-square value cannot tell you what the relationship between two variables is, only that a relationship exists.
There are no distributional assumptions with chi-square.
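Chi-square formula: chi-square = Σ (fo - fe)^2 / fe, summed over every cell of the table, where fo is the observed frequency and fe is the expected frequency.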
observed frequencies = frequencies from sample data
expected frequencies = if the null were true, these are the frequencies we would expect
If there is a large difference between observed and expected frequencies, the two variables are not likely independent. They are probably related.
If there are small or no differences between observed and expected frequencies, the two variables are likely independent. They are probably not related.
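Chi-square ranges from 0 to infinity. A value of 0 means the observed frequencies match the expected frequencies in every cell: in the sample, the two variables are completely independent, with no relationship whatsoever between them.
critical chi-square = the cutoff value for the chosen alpha level and degrees of freedom. If the null were correct, a calculated chi-square this large or larger would occur only alpha of the time, so reject the null when the calculated chi-square exceeds it.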
Degrees of freedom = (r-1)(c-1)
r = # of rows
c = # of columns
alpha level = set a priori (before looking at the data)
Chi-square is a cell by cell comparison of the expected and observed frequencies in a crosstabular table.
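The cell-by-cell comparison can be sketched in a few lines of Python. This is a minimal illustration rather than a statistics-package implementation: the function names (`expected_frequencies`, `chi_square`, `degrees_of_freedom`) and the counts in `toy_table` are hypothetical, chosen only to show the arithmetic.

```python
def expected_frequencies(observed):
    """Expected count for each cell if the null (independence) were true."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # fe = (row marginal * column marginal) / n
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Sum of (fo - fe)^2 / fe over every cell of the crosstab."""
    expected = expected_frequencies(observed)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """df = (r - 1)(c - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Hypothetical counts: rows are DV categories, columns are IV categories.
toy_table = [[30, 10],
             [20, 40]]
toy_chi_square = chi_square(toy_table)
toy_df = degrees_of_freedom(toy_table)
print(round(toy_chi_square, 2), toy_df)
```

The calculated value is then compared with the critical chi-square for the same degrees of freedom and alpha level.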
Example of a crosstabular table.
Columns: Preferred Family Size. Rows: Support for Abortion.
Support for Abortion | Large | Small |
Yes | # | # |
No | # | # |
What is the relationship between people's preference for family size (IV) and their attitudes about abortion (DV)?
Ho: There is no relationship between people's preference for family size and whether they support abortion.
H1: There is a relationship between people's preference for family size and whether they support abortion.
If there was no relationship, what would you expect the crosstab to look like?
Columns: Preferred Family Size. Rows: Support for Abortion.
Support for Abortion | Large | Small |
Yes | 50% | 50% |
No | 50% | 50% |
Among the people who want large families, there are equal numbers (percentages) of people who are for and against abortion. The same is true among people who want small families.
If there was a relationship, what would you expect the crosstab to look like?
Columns: Preferred Family Size. Rows: Support for Abortion.
Support for Abortion | Large | Small |
Yes | 0% | 100% |
No | 100% | 0% |
None of the people who want large families support abortion. All the people who want small families support abortion.
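To see how these two extremes show up in the statistic, here is a small sketch. The notes give only percentages, so the counts below (100 people per column) are an assumption made for illustration, and `chi_square` is a hypothetical helper name.

```python
def chi_square(observed):
    """Sum of (fo - fe)^2 / fe, with fe = row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n
            total += (fo - fe) ** 2 / fe
    return total


# Assumed counts: 100 respondents in each family-size column.
# Rows: Yes, No. Columns: Large, Small.
no_relationship = [[50, 50],
                   [50, 50]]
perfect_relationship = [[0, 100],
                        [100, 0]]

chi_none = chi_square(no_relationship)
chi_perfect = chi_square(perfect_relationship)
print(chi_none, chi_perfect)
```

With identical column percentages every observed count equals its expected count, so the statistic is 0. In the perfect 2x2 case every cell is as far from its expected count as it can be, and the statistic equals the sample size.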
Example 1.
What is the relationship between gender and fear of walking alone at night?
Ho: There is no relationship between gender and fear of walking alone at night.
H1: There is a relationship between gender and fear of walking alone at night.
x (gender) = men, women
y (fear of walking alone at night) = no, yes
alpha = .05
critical chi-square = 3.841 [df = (2-1)(2-1) = 1]
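The critical value normally comes from a printed chi-square table. As a cross-check, the sketch below recomputes it with the Python standard library. It relies on a fact that holds only for df = 1: a chi-square variable with 1 degree of freedom is a squared standard normal variable. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
df = (2 - 1) * (2 - 1)

# df = 1 only: the right-tail chi-square cutoff is the square of the
# two-tailed z cutoff (the z value that leaves alpha/2 in each tail).
z_cutoff = NormalDist().inv_cdf(1 - alpha / 2)
critical_chi_square = z_cutoff ** 2

print(df, round(critical_chi_square, 3))
```

The result should agree with the table value of 3.841 to three decimals.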
Draw Diagram. See Board.
Calculate chi-square.
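Observed frequencies, with column percentages in parentheses (columns: Gender; rows: Fear of Walking Alone at Night)
Fear of Walking Alone at Night | Men | Women | Totals |
No | 186 (75%) | 94 (38.06%) | 280 (56.57%) |
Yes | 62 (25%) | 153 (61.94%) | 215 (43.43%) |
Totals | 248 (100%) | 247 (100%) | 495 (100%) |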
*Always make IV the column variable. Always calculate percentages based on column totals.
Expected frequencies (columns: Gender; rows: Fear of Walking Alone at Night)
Fear of Walking Alone at Night | Men | Women | Totals |
No | 140.28 | 139.72 | 280 |
Yes | 107.72 | 107.28 | 215 |
Totals | 248 | 247 | 495 |
fe = (column marginal)(row marginal) / n
Men/No = (280*248)/495 = 140.28
Men/Yes = (215*248)/495 = 107.72
Women/No = (280*247)/495 = 139.72
Women/Yes = (215*247)/495 = 107.28
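The four hand calculations above can be reproduced from the observed counts alone. This is a minimal sketch; the variable names are illustrative, and the counts are the observed frequencies from this example.

```python
rows = ["No", "Yes"]          # fear of walking alone at night (DV)
cols = ["Men", "Women"]       # gender (IV)
observed = [[186, 94],
            [62, 153]]

row_totals = [sum(row) for row in observed]          # row marginals
col_totals = [sum(col) for col in zip(*observed)]    # column marginals
n = sum(row_totals)

# fe = (column marginal)(row marginal) / n for each cell
expected = {
    (col, row): row_totals[i] * col_totals[j] / n
    for i, row in enumerate(rows)
    for j, col in enumerate(cols)
}

for (col, row), fe in expected.items():
    print(f"{col}/{row} = {fe:.2f}")
```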
Cell | fo | fe | fo-fe | (fo-fe)^2 | (fo-fe)^2/fe |
men/no | 186 | 140.28 | 45.72 | 2090.32 | 14.90 |
men/yes | 62 | 107.72 | -45.72 | 2090.32 | 19.40 |
women/no | 94 | 139.72 | -45.72 | 2090.32 | 14.96 |
women/yes | 153 | 107.28 | 45.72 | 2090.32 | 19.48 |
total | | | | | 68.74 |
Calculated chi-square = 68.74, which exceeds the critical chi-square of 3.841.
Reject the null. There is a relationship between gender and fear of walking alone at night.
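The whole calculation for this example, from observed counts to the decision, can be sketched as follows. The p-value line is an addition that is not part of the hand calculation: it uses a closed form that is valid only for df = 1, and is included as an assumption-labeled extra.

```python
import math

observed = [[186, 94],     # No:  men, women
            [62, 153]]     # Yes: men, women

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        chi_sq += (fo - fe) ** 2 / fe

critical = 3.841                   # df = 1, alpha = .05, from the example
reject_null = chi_sq > critical

# Assumption: for df = 1 only, the right-tail probability is
# erfc(sqrt(chi_sq / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

print(f"chi-square = {chi_sq:.2f}, reject null: {reject_null}, p = {p_value:.2e}")
```

Because the code does not round fe before squaring, the result can differ from the hand-computed table in the third decimal place.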
Go to observed table to interpret relationship.
Compare column %'s. Look down columns for patterns.
What do most of the men respond with? Not afraid
What do most of the women respond with? Afraid.
What is the pattern? Women are more likely to fear walking alone at night than are men.
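The interpretation step, comparing column percentages, can be sketched the same way. The dictionary layout and names below are illustrative; the counts are the observed frequencies from this example.

```python
# Observed counts, keyed by the IV category (the column variable).
observed_by_gender = {
    "Men": {"No": 186, "Yes": 62},
    "Women": {"No": 94, "Yes": 153},
}

# Percentages are always based on the column totals.
column_percentages = {}
for gender, counts in observed_by_gender.items():
    column_total = sum(counts.values())
    column_percentages[gender] = {
        answer: 100 * count / column_total for answer, count in counts.items()
    }

for gender, percents in column_percentages.items():
    print(gender, {answer: round(p, 2) for answer, p in percents.items()})

# Gap in the percentage who are afraid, women minus men.
gap_afraid = column_percentages["Women"]["Yes"] - column_percentages["Men"]["Yes"]
```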
Example 2.
What is the relationship between social class (IV) and perceived health (DV)?
x = low, middle, high social class
y = poor, fair, good health
Ho: There is no relationship between social class and health.
H1: There is a relationship between social class and health.
Alpha = .05
Degrees of freedom = (3-1)(3-1) = 4
critical chi-square = 9.488
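As a cross-check on the table lookup, the critical value for df = 4 can be recovered numerically. The sketch below uses a closed form for the right-tail probability that applies only when df = 4, plus a simple bisection search; both function names are hypothetical.

```python
import math


def upper_tail_df4(x):
    """P(chi-square with 4 df > x). Closed form valid for df = 4 only."""
    return math.exp(-x / 2) * (1 + x / 2)


def critical_value_df4(alpha, lo=0.0, hi=100.0):
    """Find x with upper-tail probability alpha by bisection."""
    # The upper-tail probability shrinks as x grows, so halve the interval
    # toward the side where the tail still exceeds alpha.
    for _ in range(100):
        mid = (lo + hi) / 2
        if upper_tail_df4(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


df = (3 - 1) * (3 - 1)
critical_chi_square = critical_value_df4(0.05)
print(df, round(critical_chi_square, 3))
```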
Make diagram. See board.
Below is a crosstab of the relationship between social class and health from a national sample.
Observed frequencies, with column percentages in parentheses (columns: Social Class; rows: Perceived Health)
Perceived Health | Low | Middle | High | Total |
Poor | 15 (38%) | 31 (12%) | 18 (9%) | 64 (13%) |
Fair | 14 (36%) | 114 (45%) | 57 (28%) | 185 (37%) |
Good | 10 (26%) | 109 (43%) | 127 (63%) | 246 (50%) |
Totals | 39 (100%) | 254 (100%) | 202 (100%) | 495 (100%) |
*Always make the IV the column variable, and always calculate percentages out of the column totals.
Next, calculate the frequencies we would expect if the null were correct.
fe = (column marginal)(row marginal) / n
Make an expected frequency table.
Expected frequencies (columns: Social Class; rows: Perceived Health)
Perceived Health | Low | Middle | High | Total |
Poor | 5.04 | 32.84 | 26.12 | 64 |
Fair | 14.58 | 94.93 | 75.49 | 185 |
Good | 19.38 | 126.23 | 100.39 | 246 |
Totals | 39 | 254 | 202 | 495 |
low/poor = (64*39)/495 = 5.04
low/fair = (185*39)/495 = 14.58
low/good = (246*39)/495 = 19.38
middle/poor = (64*254)/495 = 32.84
middle/fair = (185*254)/495 = 94.93
middle/good = (246*254)/495 = 126.23
high/poor = (64*202)/495 = 26.12
high/fair = (185*202)/495 = 75.49
high/good = (246*202)/495 = 100.39
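The nine expected frequencies can be generated in one pass from the observed table, which also guards against arithmetic slips. This is a minimal sketch with illustrative variable names; the counts are the observed frequencies from this example.

```python
health = ["poor", "fair", "good"]          # rows (DV)
classes = ["low", "middle", "high"]        # columns (IV)
observed = [[15, 31, 18],
            [14, 114, 57],
            [10, 109, 127]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# fe = (column marginal)(row marginal) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

for j, social_class in enumerate(classes):
    for i, level in enumerate(health):
        print(f"{social_class}/{level} = {expected[i][j]:.2f}")

# Sanity check: expected counts in each column add back up to the column total.
expected_col_sums = [sum(col) for col in zip(*expected)]
```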
Now calculate chi-square
Cell | fo | fe | fo-fe | (fo-fe)^2 | (fo-fe)^2/fe |
low/poor | 15 | 5.04 | 9.96 | 99.2 | 19.68 |
low/fair | 14 | 14.58 | -.58 | .34 | .02 |
low/good | 10 | 19.38 | -9.38 | 87.98 | 4.54 |
middle/poor | 31 | 32.84 | -1.84 | 3.39 | .10 |
middle/fair | 114 | 94.93 | 19.07 | 363.66 | 3.83 |
middle/good | 109 | 126.23 | -17.23 | 296.87 | 2.35 |
high/poor | 18 | 26.12 | -8.12 | 65.93 | 2.52 |
high/fair | 57 | 75.49 | -18.49 | 341.88 | 4.53 |
high/good | 127 | 100.39 | 26.61 | 708.09 | 7.05 |
total | | | | | 44.62 |
Calculated chi-square = 44.62, which exceeds the critical chi-square of 9.488.
Reject the null. There is a relationship between social class and health.
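The same steps for the 3x3 table can be sketched in code. The p-value line goes beyond the hand calculation and uses a closed form that is valid only for df = 4; treat it as an assumption-labeled extra rather than part of the original method.

```python
import math

observed = [[15, 31, 18],      # poor: low, middle, high
            [14, 114, 57],     # fair
            [10, 109, 127]]    # good

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        chi_sq += (fo - fe) ** 2 / fe

critical = 9.488                   # df = 4, alpha = .05, from the example
reject_null = chi_sq > critical

# Assumption: for df = 4 only, P(chi-square > x) = exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

print(f"chi-square = {chi_sq:.2f}, reject null: {reject_null}, p = {p_value:.2e}")
```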
Have to see observed table to interpret the relationship.
Compare column %'s. Look down columns for patterns.
What is the most common response among people in the low social class? Poor health
What is the most common response among people in the middle social class? Fair health
What is the most common response among people in the high social class? Good health
What is the pattern? As social class increases, people's perceived health improves. People in the high social class are most likely to perceive themselves in good health. People in the middle social class are most likely to perceive themselves in fair health. People in the lower social class are most likely to perceive themselves in poor health.
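Reading down each column for its most common response can also be sketched in code. The names below are illustrative; the counts are the observed frequencies from this example.

```python
health_levels = ["Poor", "Fair", "Good"]
observed_by_class = {
    "Low": [15, 14, 10],
    "Middle": [31, 114, 109],
    "High": [18, 57, 127],
}

column_percentages = {}
modal_response = {}
for social_class, counts in observed_by_class.items():
    column_total = sum(counts)
    percents = [100 * count / column_total for count in counts]
    column_percentages[social_class] = dict(zip(health_levels, percents))
    # The modal response is the health level with the largest count in the column.
    modal_response[social_class] = health_levels[counts.index(max(counts))]

for social_class, percents in column_percentages.items():
    rounded = {level: round(p, 1) for level, p in percents.items()}
    print(social_class, rounded, "most common:", modal_response[social_class])
```

In the low and middle columns the most common response is a plurality rather than a majority, which is worth keeping in mind when describing the pattern.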
Limitations of chi-square
1. Heavily influenced by n. For a given pattern of column percentages, chi-square grows in direct proportion to n, regardless of how strong the relationship between x and y is.
If n is large, you will likely find statistically significant relationships even when the relationship is too weak to matter in practice.
If n is small, you will likely not find statistically significant relationships even if in reality there is a relationship.
2. Sensitive to small expected frequencies. If fe < 5 in one or more cells, the calculated chi-square is unstable and the test result should be treated with caution.
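Both limitations can be illustrated with a short sketch. All counts here are hypothetical, and the helper names (`expected_frequencies`, `chi_square`, `small_expected_cells`) are made up for the illustration. The first part multiplies every cell of a weakly related table by 10 and by 100, which leaves the column percentages unchanged; the second part flags cells whose expected frequency falls below 5.

```python
def expected_frequencies(observed):
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(observed):
    expected = expected_frequencies(observed)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )


def small_expected_cells(observed, threshold=5):
    """Return (row index, column index, fe) for every cell with fe < threshold."""
    expected = expected_frequencies(observed)
    return [
        (i, j, fe)
        for i, row in enumerate(expected)
        for j, fe in enumerate(row)
        if fe < threshold
    ]


# Limitation 1: hypothetical weak relationship (52% vs. 48%), n = 200.
base_table = [[52, 48],
              [48, 52]]
chi_by_factor = {}
for factor in (1, 10, 100):
    scaled = [[cell * factor for cell in row] for row in base_table]
    chi_by_factor[factor] = chi_square(scaled)
    print(f"n = {200 * factor}: chi-square = {chi_by_factor[factor]:.2f}")

# Limitation 2: hypothetical table with one sparse column.
sparse_table = [[3, 17],
                [7, 73]]
flagged = small_expected_cells(sparse_table)
print(flagged)
```

With a critical value of 3.841 (df = 1, alpha = .05), the same 52% versus 48% split is not significant at n = 200 but is significant at n = 20,000.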