Chi-Square Tests

Use a chi-square test when you have a categorical independent variable and a categorical dependent variable.

Ho: There is no relationship between x and y.  Chi-square = 0

H1: There is a relationship between x and y.  Chi-square > 0

*Fill in what x and y are in the above hypotheses.

*You cannot have a directional hypothesis with a chi-square test.  There is only a right tail on a chi-square distribution.  Also, a chi-square value cannot tell you what the relationship between two variables is, only that a relationship exists.

Chi-square makes no assumptions about the shape of the population distribution (it is a nonparametric test).

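See board for chi-square formula:  chi-square = sum of [(fo - fe)^2 / fe] across all cells, where fo and fe are the observed and expected frequencies.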

 

observed frequencies = frequencies from sample data

expected frequencies = if the null were true, these are the frequencies we would expect

If there is a large difference between observed and expected frequencies, the two variables are not likely independent.  They are probably related.  

If there are small or no differences between observed and expected frequencies, the two variables are likely independent.  They are probably not related.  

Chi-square ranges from 0 to infinity.  A chi-square of 0 means the observed frequencies exactly match the expected frequencies: in the sample, the two variables are completely independent, with no relationship whatsoever between them.
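The comparison of observed and expected frequencies can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square` and the cell counts are made up for the example and are not lecture data.

```python
def chi_square(observed, expected):
    """Sum (fo - fe)^2 / fe over all cells.

    Both arguments are flat lists of cell frequencies in the same order.
    """
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical counts, for illustration only.
expected = [25.0, 25.0, 25.0, 25.0]
same = chi_square([25, 25, 25, 25], expected)       # observed equals expected
different = chi_square([40, 10, 10, 40], expected)  # observed far from expected

print(same)       # 0.0
print(different)  # 36.0
```

The larger the gaps between observed and expected frequencies, the larger the statistic; identical tables give exactly 0.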

 

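critical chi-square = the cutoff for rejecting the null.  If the null were correct, a chi-square this large or larger would occur only with probability alpha.  Reject the null when the calculated chi-square is greater than the critical chi-square.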

See chi-square critical table

Degrees of freedom = (r-1)(c-1)

r = # of rows

c = # of columns

alpha level = set a priori
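A small sketch of the lookup, assuming alpha = .05. The dictionary `CRITICAL_05` and the helper names are hypothetical; the values are the standard table entries for df = 1 through 6.

```python
# Critical chi-square values at alpha = .05 for small df,
# as listed in a standard chi-square critical table.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def degrees_of_freedom(rows, cols):
    """df = (r - 1)(c - 1) for a crosstab with r rows and c columns."""
    return (rows - 1) * (cols - 1)


def critical_value(rows, cols):
    """Look up the alpha = .05 critical value for an r-by-c crosstab."""
    return CRITICAL_05[degrees_of_freedom(rows, cols)]


print(degrees_of_freedom(2, 2), critical_value(2, 2))  # 1 3.841
print(degrees_of_freedom(3, 3), critical_value(3, 3))  # 4 9.488
```

For a different alpha level you would need a different column of the critical table.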

 

Chi-square is a cell by cell comparison of the expected and observed frequencies in a crosstabular table.

Example of a crosstabular table.

 

                        Preferred Family Size
Support for Abortion    Large    Small
Yes                     #        #
No                      #        #


What is the relationship between people's preference for family size (IV) and their attitudes about abortion (DV)? 

Ho: There is no relationship between people's preference for family size and whether they support abortion. 

H1: There is a relationship between people's preference for family size and whether they support abortion. 


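If there were no relationship, what would you expect the crosstab to look like?

                        Preferred Family Size
Support for Abortion    Large    Small
Yes                     50%      50%
No                      50%      50%

Among the people who want large families, there are equal percentages of people who are for and against abortion.  The same is true among people who want small families.  What matters is that the column percentages are the same in both columns (they do not have to be 50/50): knowing someone's preferred family size tells you nothing about whether they support abortion.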


If there were a perfect relationship, what would you expect the crosstab to look like?

                        Preferred Family Size
Support for Abortion    Large    Small
Yes                     0%       100%
No                      100%     0%

None of the people who want large families support abortion.  All the people who want small families support abortion.  This is the strongest possible relationship; real data usually fall somewhere between these two extremes.
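To make the two extremes concrete, here is a rough sketch that computes chi-square for each kind of table. The counts (100 people per column) are hypothetical, chosen only to match the percentages in the two illustrations, and `chi_square_from_table` is an illustrative helper name.

```python
def chi_square_from_table(table):
    """Chi-square for a crosstab given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n  # expected under the null
            total += (fo - fe) ** 2 / fe
    return total


# Hypothetical counts: rows are Yes/No, columns are Large/Small.
no_relationship = [[50, 50], [50, 50]]
perfect_relationship = [[0, 100], [100, 0]]

print(chi_square_from_table(no_relationship))       # 0.0
print(chi_square_from_table(perfect_relationship))  # 200.0
```

With identical columns the statistic is 0; with the perfect pattern it reaches its maximum for a 2x2 table, which equals n (200 here).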


Example 1.

What is the relationship between gender and fear of walking alone at night?

Ho: There is no relationship between gender and fear of walking alone at night.

H1:  There is a relationship between gender and fear of walking alone at night.

x = gender: men, women

y = fear: no, yes

alpha = .05

Get chi-square critical value

critical chi-square = 3.841 [df = (2-1)(2-1) = 1]

Draw Diagram.  See Board.

 

 

Calculate chi-square.

Observed (each cell: frequency/column %)
                                  Gender
Fear of Walking Alone at Night    Men         Women         Totals
No                                186/75%     94/38.06%     280/56.57%
Yes                               62/25%      153/61.94%    215/43.43%
Totals                            248/100%    247/100%      495/100%

*Always make the IV the column variable.  Always calculate percentages based on the column totals.
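The column percentages in the observed table can be reproduced with a short sketch like the following. The helper name `column_percentages` is an assumption for illustration; the counts are the Example 1 observed frequencies.

```python
def column_percentages(table):
    """Convert each cell to a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100 * fo / col_totals[j] for j, fo in enumerate(row)]
        for row in table
    ]


# Example 1 observed frequencies (rows: No, Yes; columns: Men, Women).
observed = [[186, 94], [62, 153]]
percentages = column_percentages(observed)

for label, row in zip(("No", "Yes"), percentages):
    print(label, [round(p, 2) for p in row])
# No [75.0, 38.06]
# Yes [25.0, 61.94]
```

Because the IV is the column variable, each column sums to 100%, which lets you compare men and women directly even though the group sizes differ.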

Expected
                                  Gender
Fear of Walking Alone at Night    Men       Women     Totals
No                                140.28    139.72    280
Yes                               107.72    107.28    215
Totals                            248       247       495

 

fe = (column marginal)(row marginal) / n     (calculated for each cell)

Men/No = (280*248)/495 = 140.28

Men/Yes = (215*248)/495 = 107.72

Women/No = (280*247)/495 = 139.72

Women/Yes = (215*247)/495 = 107.28
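The four calculations above follow one rule, which can be sketched as a small function. The name `expected_frequencies` is illustrative; the input is the Example 1 observed table.

```python
def expected_frequencies(table):
    """fe = (row marginal * column marginal) / n for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Example 1 observed frequencies (rows: No, Yes; columns: Men, Women).
observed = [[186, 94], [62, 153]]
expected = expected_frequencies(observed)

for label, row in zip(("No", "Yes"), expected):
    print(label, [round(fe, 2) for fe in row])
# No [140.28, 139.72]
# Yes [107.72, 107.28]
```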

 

             fo     fe        fo-fe     (fo-fe)^2    (fo-fe)^2/fe
men/no       186    140.28     45.72    2090.32      14.90
men/yes      62     107.72    -45.72    2090.32      19.40
women/no     94     139.72    -45.72    2090.32      14.96
women/yes    153    107.28     45.72    2090.32      19.48
total                                                68.74

Calculated chi-square = 68.74
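As a cross-check on the hand calculation, the whole worksheet can be sketched in Python. The variable names are illustrative. The code does not round fe before squaring, so individual cell contributions may differ from the hand-calculated ones in the last decimal place.

```python
# Example 1 observed frequencies (rows: No, Yes; columns: Men, Women).
observed = [[186, 94], [62, 153]]
CRITICAL = 3.841  # critical chi-square for df = 1, alpha = .05

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        contribution = (fo - fe) ** 2 / fe
        chi_square += contribution
        print(f"fo={fo:>3}  fe={fe:7.2f}  (fo-fe)^2/fe={contribution:6.2f}")

reject_null = chi_square > CRITICAL
print(f"chi-square = {chi_square:.2f}, reject null: {reject_null}")
# chi-square = 68.74, reject null: True
```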

Reject the null, because the calculated chi-square (68.74) is greater than the critical chi-square (3.841).  There is a relationship between gender and fear of walking alone at night.

Go to observed table to interpret relationship.  

Compare column %'s.  Look down columns for patterns. 

What do most of the men respond with?  Not afraid (75%).

What do most of the women respond with?  Afraid (61.94%).

What is the pattern?  Women are more likely to fear walking alone at night than are men. 

 


Example 2.

What is the relationship between social class (IV) and perceived health (DV)? 

x = low, middle, high social class

y = poor, fair, good health  

Ho: There is no relationship between social class and health.

H1: There is a relationship between social class and health.

Alpha = .05

Degrees of freedom = (3-1)(3-1) = 4

Get chi-square critical value

critical chi-square =  9.488

Make diagram.  See board.

 

 

Below is a crosstab of the relationship between social class and health from a national sample. 

Observed (each cell: frequency/column %)
                    Social Class
Perceived Health    Low        Middle      High        Total
Poor                15/38%     31/12%      18/9%       64/13%
Fair                14/36%     114/45%     57/28%      185/37%
Good                10/26%     109/43%     127/63%     246/50%
Totals              39/100%    254/100%    202/100%    495/100%

*Always make the IV the column variable.  And always calculate percentages based on the column totals.

Next, calculate the expected frequencies: the frequencies we would expect if the null were correct.

fe = (column marginal)(row marginal) / n     (calculated for each cell)

Make an expected frequency table.

Expected
                    Social Class
Perceived Health    Low      Middle    High      Total
Poor                5.04     32.84     26.12     64
Fair                14.58    94.93     75.49     185
Good                19.38    126.23    100.39    246
Totals              39       254       202       495

low/poor =  (64*39)/495 = 5.04

low/fair = (185*39)/495 = 14.58

low/good = (246*39)/495 = 19.38

middle/poor = (64*254)/495 = 32.84

middle/fair = (185*254)/495 = 94.93

middle/good = (246*254)/495 = 126.23

high/poor =  (64*202)/495 =  26.12

high/fair = (185*202)/495 =  75.49

high/good = (246*202)/495 = 100.39
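The nine calculations can be checked with a short sketch that applies the same rule to every cell of the 3x3 table. Variable names are illustrative; the counts are the Example 2 observed frequencies. The last lines confirm that the expected table keeps the same column totals as the observed table, which is a useful check on hand arithmetic.

```python
# Example 2 observed frequencies (columns: Low, Middle, High social class).
observed = [
    [15, 31, 18],    # Poor
    [14, 114, 57],   # Fair
    [10, 109, 127],  # Good
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# fe = (row marginal * column marginal) / n for every cell.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for label, row in zip(("Poor", "Fair", "Good"), expected):
    print(label, [round(fe, 2) for fe in row])
# Poor [5.04, 32.84, 26.12]
# Fair [14.58, 94.93, 75.49]
# Good [19.38, 126.23, 100.39]

# The expected table preserves the observed marginals.
expected_col_totals = [sum(col) for col in zip(*expected)]
print([round(t, 2) for t in expected_col_totals])  # [39.0, 254.0, 202.0]
```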

 

Now calculate chi-square

               fo     fe        fo-fe     (fo-fe)^2    (fo-fe)^2/fe
low/poor       15     5.04       9.96     99.20        19.68
low/fair       14     14.58     -0.58     0.34         0.02
low/good       10     19.38     -9.38     87.98        4.54
middle/poor    31     32.84     -1.84     3.39         0.10
middle/fair    114    94.93     19.07     363.66       3.83
middle/good    109    126.23   -17.23     296.87       2.35
high/poor      18     26.12     -8.12     65.93        2.52
high/fair      57     75.49    -18.49     341.88       4.53
high/good      127    100.39    26.61     708.09       7.05
total                                                  44.62

calculated chi-square = 44.62
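A sketch of the same calculation in Python, which also reports the cell that contributes most to chi-square. The names (`contributions`, `largest`) are illustrative, and the code uses unrounded expected frequencies, so per-cell values can differ slightly from the hand worksheet.

```python
# Example 2 observed frequencies, keyed by (social class, perceived health).
classes = ("low", "middle", "high")
health = ("poor", "fair", "good")
observed = [
    [15, 31, 18],    # poor
    [14, 114, 57],   # fair
    [10, 109, 127],  # good
]
CRITICAL = 9.488  # critical chi-square for df = 4, alpha = .05

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

contributions = {}
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        contributions[(classes[j], health[i])] = (fo - fe) ** 2 / fe

chi_square = sum(contributions.values())
largest = max(contributions, key=contributions.get)

print(round(chi_square, 2))   # 44.62
print(chi_square > CRITICAL)  # True
print(largest)                # ('low', 'poor')
```

The low/poor cell contributes the most: far more low class respondents report poor health than the null would predict.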

Reject the null, because the calculated chi-square (44.62) is greater than the critical chi-square (9.488).  There is a relationship between social class and health.

Have to see observed table to interpret the relationship.

Compare column %'s.  Look down columns for patterns.

What is the most common response among the low class people?  Poor health

What is the most common response among the middle class people?  Fair health

What is the most common response among the high class people?  Good health

What is the pattern?  As social class increases, people's perceived health improves.  People in the high social class are most likely to perceive themselves in good health.  People in the middle social class are most likely to perceive themselves in fair health.  People in the lower social class are most likely to perceive themselves in poor health.


Limitations of chi-square

1.  Heavily influenced by n.  For the same pattern of column percentages, chi-square increases in direct proportion to n, regardless of how strong the relationship between x and y is.

If there is a large n, even a very weak, substantively trivial relationship will likely be statistically significant.

If there is a small n, you will likely not find a statistically significant relationship even if in reality there is a relationship.
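The effect of n can be seen in a quick sketch. The base table is hypothetical (54% vs. 46% in each column, a weak pattern); multiplying every cell by the same factor keeps the column percentages identical but multiplies chi-square by that factor.

```python
def chi_square_from_table(table):
    """Chi-square for a crosstab given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (fo - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, fo in enumerate(row)
    )


CRITICAL = 3.841  # df = 1, alpha = .05

# Hypothetical weak pattern: 54% vs. 46% in each column, n = 100.
base = [[27, 23], [23, 27]]

results = {}
for factor in (1, 10, 100):
    scaled = [[fo * factor for fo in row] for row in base]
    results[factor] = chi_square_from_table(scaled)
    print(factor * 100, round(results[factor], 2), results[factor] > CRITICAL)
# 100 0.64 False
# 1000 6.4 True
# 10000 64.0 True
```

The same weak pattern is not significant at n = 100 but is significant at n = 1,000, which is why a significant chi-square says nothing about how strong a relationship is.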

2. Sensitive to small expected frequencies.  If fe < 5 in one or more cells, the calculated chi-square is unstable and the test result may not be trustworthy.
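A simple way to screen a table for this problem is to list the cells whose expected frequency falls below 5. The sketch below assumes a hypothetical helper `small_expected_cells`; the first table is the Example 2 observed data and the second is a made-up small sample.

```python
def small_expected_cells(table, threshold=5):
    """Return (row index, column index, fe) for cells with fe below threshold."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    flagged = []
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            fe = r * c / n
            if fe < threshold:
                flagged.append((i, j, fe))
    return flagged


# Example 2 observed frequencies: the smallest fe is 5.04, so nothing is flagged.
example_2 = [[15, 31, 18], [14, 114, 57], [10, 109, 127]]
print(small_expected_cells(example_2))  # []

# Hypothetical small sample: both cells in the first column have fe = 2.5.
small_sample = [[3, 7], [2, 8]]
print(small_expected_cells(small_sample))  # [(0, 0, 2.5), (1, 0, 2.5)]
```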