Statistic | 9:30 Course (001) |
Mode | 83 |
Median | 81 |
Mean | 73 |
Range | 21 to 100 |
Standard Deviation | 21.38 |
Outliers (using 2 s rule) | 21 (score) |
Recalculated Mean w/o outlier | 76 |
Number As | 10 |
Number Bs | 7 |
Number Cs | 2 |
Number Ds | 4 |
Number Fs | 7 |
Q7 Response: Understand | 28 True, 1 False |
Q8 Response: Fair | 29 True |
SOC 301 Data Analysis, Spring 2004 Exam 1
ANSWERS
1. Identify the level of measurement of the variables below: 10 points
a. How many schools you applied to for college Interval
b. Residents’ satisfaction with the Wilmington Police Department (WPD) measured as very satisfied, somewhat satisfied, somewhat dissatisfied, very dissatisfied Ordinal
c. Whether or not a student owns an SUV Nominal
d. Marital status Nominal
e. Income measured as less than $10,000, $10,001 to $15,000, $15,001 to $20,000, more than $20,000 Ordinal
2. Identify the independent and dependent variable in the following research questions: 10 points
a. Does household income determine the number of colleges that high school students apply to? IV = income DV = # of applications
b. Who commits more crime – people who own SUVs or people who do not?
IV = SUV ownership DV = # of crimes
c. Why does resident satisfaction with the WPD vary by race and ethnicity? IV = race and ethnicity DV = satisfaction with WPD
d. Does marital status influence the number of vacations people take each year? IV = marital status DV = # of vacations
e. How are attitudes about sexual diversity impacted by gender? IV = gender DV = attitudes about diversity
3. What is a spurious relationship? Give an example not used in class or in a homework assignment. Be sure to identify the spurious variable and why it is a spurious variable. 10 points
A spurious relationship is a relationship that appears to exist between an independent and dependent variable, but which is really caused by a third variable that influences both the independent and dependent variables and which is not being included in the analysis.
Examples from student exams:
food intake influences weight (spurious variable = exercise)
employment status influences poverty (spurious variable = the health of the economy, housing costs, number of children)
gender influences income (spurious variable = education)
race influences whether someone commits crime (spurious variable = poverty)
moving influences whether or not you smoke (spurious variable = stress)
4. Below is data from a random sample of Americans on how many hours of TV they watch a day.
5, 0, 3, 0, 1, 2, 4, 2, 2
a. Create a frequency table for this data. 10 points
x | f | % | Valid% | Cum% |
0 | 2 | 22.22% | 22.22% | 22.22% |
1 | 1 | 11.11% | 11.11% | 33.33% |
2 | 3 | 33.33% | 33.33% | 66.67% |
3 | 1 | 11.11% | 11.11% | 77.78% |
4 | 1 | 11.11% | 11.11% | 88.89% |
5 | 1 | 11.11% | 11.11% | 100.00% |
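As a cross-check on the frequency table, here is a minimal Python sketch that tallies the same nine values. The function name `frequency_table` is only illustrative. It accumulates the counts before dividing, which avoids the rounding drift that comes from adding up already-rounded percentages.

```python
from collections import Counter

hours = [5, 0, 3, 0, 1, 2, 4, 2, 2]  # data from question 4


def frequency_table(values):
    """Return rows of (x, f, percent, cumulative percent), sorted by x."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cum_f = 0
    for x in sorted(counts):
        cum_f += counts[x]  # running count, divided by n only at the end
        pct = round(100 * counts[x] / n, 2)
        cum_pct = round(100 * cum_f / n, 2)
        rows.append((x, counts[x], pct, cum_pct))
    return rows


for x, f, pct, cum_pct in frequency_table(hours):
    print(f"{x} | {f} | {pct:.2f}% | {cum_pct:.2f}% |")
```

There are no missing cases in this sample, so the valid percent equals the percent and is not computed separately.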
b. What univariate statistics are appropriate for this data? 5 points
Level of Measurement is interval, so all univariate statistics apply.
c. Calculate the univariate statistics identified above. 20 points
Mode = 2; Median = 2; range = 0 to 5 or 5
Mean and Standard Deviation are in table below.
See board for math.
x | f | fx | x*x | x*x*f |
0 | 2 | 0 | 0 | 0 |
1 | 1 | 1 | 1 | 1 |
2 | 3 | 6 | 4 | 12 |
3 | 1 | 3 | 9 | 9 |
4 | 1 | 4 | 16 | 16 |
5 | 1 | 5 | 25 | 25 |
Total | 9 | 19 | | 63 |
Mean = 19/9 = 2.11; s = 1.69
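The board work is not reproduced in this key. As a sketch of one way to reach the same figures from the column totals, the code below assumes the computational formula for the sample standard deviation, s = sqrt((Σfx² − (Σfx)²/n) / (n − 1)). Dividing by n − 1 is an assumption, but it is the version that reproduces s = 1.69.

```python
import math

# x -> f, taken from the frequency columns of the table
freq = {0: 2, 1: 1, 2: 3, 3: 1, 4: 1, 5: 1}

n = sum(freq.values())                              # total f
sum_fx = sum(x * f for x, f in freq.items())        # total of the fx column
sum_fx2 = sum(x * x * f for x, f in freq.items())   # total of the x*x*f column

mean = sum_fx / n
# Sample standard deviation: n - 1 in the denominator (assumed)
s = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))

print(n, sum_fx, sum_fx2)
print(round(mean, 2), round(s, 2))
```

Dividing by n instead of n − 1 would give about 1.59, so the choice of denominator matters for a sample this small.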
d. Are there any outliers in this data? Indicate which method you choose to identify outliers and the value of the outlier(s). 5 points
Using 99% rule, 5 is an outlier.
Using 2 s rule, no outliers (2.11 +/- 2*1.69 = 5.49 on high side and -1.27 on low side)
Using 3 s rule, no outliers (2.11 +/- 3*1.69 = 7.18 on high side and -2.96 on low side)
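The 2 s and 3 s checks can be sketched in Python as follows. The helper name `s_rule_outliers` is hypothetical. The bounds here come from the unrounded mean and sample standard deviation, so they can differ from the hand-computed figures in the last decimal place.

```python
import statistics

hours = [5, 0, 3, 0, 1, 2, 4, 2, 2]  # data from question 4


def s_rule_outliers(values, k):
    """Return (low, high, outliers) for mean +/- k sample standard deviations."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    low, high = mean - k * s, mean + k * s
    outliers = [v for v in values if v < low or v > high]
    return low, high, outliers


for k in (2, 3):
    low, high, outliers = s_rule_outliers(hours, k)
    print(f"{k} s rule: {low:.2f} to {high:.2f}, outliers: {outliers}")
```

Under both rules the list of outliers is empty, which matches the conclusion that the maximum value of 5 falls inside the bounds.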
e. Based on your answer to d, which measure of central tendency should you use to summarize this data? Why? 5 points
Since there are no outliers, the mean is the best measure of central tendency.
5. Below is data from the GSS on whether or not respondents have a pistol or revolver in their home. Interpret all appropriate univariate statistics. (20 points)
Response | Frequency | Percent | Valid Percent | Cumulative Percent |
Valid: 1 YES | 364 | 12.9 | 20.0 | 20.0 |
Valid: 2 NO | 1459 | 51.8 | 80.0 | 100.0 |
Valid: Total | 1823 | | 100.0 | |
Missing: 0 NAP | 956 | 33.9 | | |
Missing: 8 DK | 1 | .0 | | |
Missing: 9 NA | 14 | .5 | | |
Missing: System | 23 | .8 | | |
Total | 2817 | 100.0 | | |

Statistic | Value |
Mean | 1.80 |
Median | 2.00 |
Mode | 2 |
Std. Dev | .400 |
Variance | .160 |
Range | 1 |
This variable is nominal so the mode and range are the only appropriate statistics.
Mode: Most Americans (80%) do not have a pistol or revolver in their home.
Range: Some Americans have a pistol or revolver in their home (20%) and some do not (80%).
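For reference, the valid percents and the mode used in this interpretation can be recovered from the valid frequencies alone. This is a minimal sketch; the variable names are illustrative, and the missing categories (NAP, DK, NA, System) are left out on purpose because valid percents exclude them.

```python
# Valid responses only, from the frequency table in question 5
counts = {"YES": 364, "NO": 1459}

n_valid = sum(counts.values())
valid_pct = {label: round(100 * f / n_valid, 1) for label, f in counts.items()}
mode = max(counts, key=counts.get)  # category with the highest frequency

print(n_valid, valid_pct, mode)
```

The mean, median, and standard deviation reported for this variable describe only the numeric codes 1 and 2, which is why they are not interpreted for a nominal variable.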