ANOVA

Stands for "analysis of variance"

ANOVA is a means test, just like the means tests you learned in the last section, but it enables you to compare more than 2 groups.

Used when:

DV = continuous

IV = categorical with more than 2 categories, usually 3-6 categories. You can do an ANOVA with an IV that has more than 6 categories; the results are just cumbersome to interpret.

ANOVA uses an F test to compare the means of the groups.  An F distribution is very similar to a chi-square distribution: both are right-skewed and take only non-negative values.  An F test in ANOVA can only tell you if there is a relationship between two variables; it can't tell you what that relationship is.  Mathematically, this means it can only tell you whether at least one of the group means is different from another one.  It can't tell you which mean is different.
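To make the F test more concrete, here is a minimal sketch of how an F value can be computed by hand: it is the variation between the group means divided by the variation within the groups. The function name one_way_f and the scores are made up for illustration; they do not come from any data set in these notes.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)                                  # number of groups
    n_total = sum(len(g) for g in groups)            # total number of cases
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: how far each case is from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means)
                    for x in g)

    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up scores for three groups (hypothetical, for illustration only).
scores = [[1, 2, 3], [2,3, 4], [5, 6, 7]]
f_value, df_b, df_w = one_way_f(scores)
print(round(f_value, 2), df_b, df_w)
```

For these made-up scores the group means are 2, 3 and 6, and F works out to 13 with 2 and 6 degrees of freedom. A large F like this says only that the group means differ somewhere; you still have to look at the means themselves to see where.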

The hypotheses we test with an F-test in ANOVA are:

Null:  There is no relationship between the IV and the DV (write in the names of the variables ...).  The means are equal.  Mean 1 = mean 2 = mean 3 .... (write out however many groups there are, i.e., write a mean for each of the categories of the IV).  F = 0.

Research:  There is a relationship between the IV and the DV (write in the names of the variables ...).  The means are not equal.  Mean 1 ≠ mean 2 ≠ mean 3 .... (write out however many groups there are, i.e., write a mean for each of the categories of the IV).  F ≠ 0.

Then draw your diagram using the alpha that you set ahead of time.  For an F-test, there will always only be one tail (the right tail).  So you will never divide alpha in half.

Then do the ANOVA on the computer.  Look at the p value associated with the F that you get.  If it is lower than alpha, reject the null.  If it is higher than alpha, fail to reject the null.  And then give your interpretation.
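The decision rule can be sketched in a few lines of code. The function name decide is hypothetical; the first p value is the one reported for Example 1 in these notes, and the second is made up to show the other outcome.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject the null only when p is below alpha."""
    if p_value < alpha:
        return "reject the null"
    return "do not reject the null"


print(decide(0.009))  # p value reported for Example 1
print(decide(0.20))   # made-up p value, larger than alpha
```

Alpha is passed in as an argument because it has to be set ahead of time, before you look at the p value.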

If you reject, you say there is a relationship and then look at the means to determine which mean is higher or lower than the others.

If you fail to reject, you say there is no evidence of a relationship, and state roughly what the mean is for all of the groups.

Example 1

Does social class influence the number of hours worked?

IV = Social Class (4 categories; lower class, working class, middle class, upper class)

DV = Hours Worked per week

Null Hypothesis: There is no relationship between social class and number of hours worked.  The average number of hours worked for each social class is equal.  Mean 1 = mean 2 = mean 3 = mean 4.   F = 0.

Research Hypothesis:  There is a relationship between social class and number of hours worked.  The average number of hours worked for each social class is not equal.  Mean 1 ≠ mean 2 ≠ mean 3 ≠ mean 4.  F ≠ 0.

Alpha = .05.  One-tailed test (that is all an F-test can do).  Draw diagram.

From SPSS, we learn that  F = 3.85, p = .009

Reject the null.  There is a relationship between social class and number of hours worked.  People in the lower class work fewer hours (about 36 hours a week) than people in the working, middle and upper classes.  People in the middle and working classes work the most (an average of 42 hours a week).

Example 2.

Does race influence socio-economic status?

IV = race (3 categories, 1 = white, 2 = black, 3 = other)

DV = socio-economic index (range of 0-100)

Null Hypothesis: There is no relationship between race and SEI.  The average SEI for each race is equal.  Mean 1 = mean 2 = mean 3.   F = 0.

Research Hypothesis:  There is a relationship between race and SEI.  The average SEI for each race is not equal.  Mean 1 ≠ mean 2 ≠ mean 3.  F ≠ 0.

Alpha = .05.  One-tailed test (this is all F-tests can do).  Draw diagram.

From SPSS we learn that F = 20.29, p < .001 (SPSS displays this as .000).

Reject the null.  There is a statistical relationship between race and SEI.  The average SEI for black respondents is about 43, which is lower than the average SEI for white respondents (mean = 50) and respondents of other races (mean = 50).

*Many of you could go on to explain, sociologically, why we got these results. If you can do that, great!

Take Home Example

Does educational degree attainment influence the number of hours of TV that people watch per day?

Alpha = .05

F = 29.18, p < .001 (SPSS displays this as .000)

                                                          95% Confidence Interval for Mean
Group               N      Mean   Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
0 LT HIGH SCHOOL    290    4.11   3.314            .195         3.73          4.50          0         24
1 HIGH SCHOOL       985    3.04   2.551            .081         2.88          3.20          0         24
2 JUNIOR COLLEGE    133    2.44   1.554            .135         2.17          2.70          0         12
3 BACHELOR          282    2.42   2.165            .129         2.16          2.67          0         12
4 GRADUATE          129    1.64   1.274            .112         1.42          1.87          0         7
Total               1819   2.97   2.586            .061         2.85          3.09          0         24

Reject the null.  Educational degree does influence the number of hours of TV that people watch.  People with less than a high school degree watch the most television per day (4.11 hours), and people with a graduate degree watch the least (1.64 hours).
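As a check on the computer output, the F value can be roughly reconstructed from the N, mean and standard deviation columns of the table alone. This is a sketch, not how SPSS itself does the calculation: the function name f_from_summary is hypothetical, and it assumes the standard deviations in the table are sample standard deviations (computed with n - 1).

```python
def f_from_summary(rows):
    """rows: list of (n, mean, sd) per group. Returns (F, df_between, df_within)."""
    n_total = sum(n for n, _, _ in rows)
    grand_mean = sum(n * m for n, m, _ in rows) / n_total

    # Between-groups sum of squares from the group means.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in rows)
    # Within-groups sum of squares, assuming sd is the sample SD (n - 1 denominator).
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in rows)

    df_between = len(rows) - 1
    df_within = n_total - len(rows)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# (N, mean, standard deviation) copied from the table above.
tv_hours = [
    (290, 4.11, 3.314),  # less than high school
    (985, 3.04, 2.551),  # high school
    (133, 2.44, 1.554),  # junior college
    (282, 2.42, 2.165),  # bachelor
    (129, 1.64, 1.274),  # graduate
]
f_value, df_b, df_w = f_from_summary(tv_hours)
print(round(f_value, 2), df_b, df_w)
```

With the rounded values in the table this gives an F of roughly 29.1 with 4 and 1814 degrees of freedom, close to the 29.18 reported above; the small gap presumably comes from rounding in the printed means and standard deviations.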