SOC 301 Data Analysis, Spring 2004 Exam 1 Score Summary and Answers 

Statistic                        9:30 Course (001)
Mode                             83
Median                           81
Mean                             73
Range                            21 to 100
Standard Deviation               21.38
Outliers (using 2 s rule)        21 (score)
Recalculated Mean w/o outlier    76
Number As                        10
Number Bs                        7
Number Cs                        2
Number Ds                        4
Number Fs                        7
Q7 Response: Understand          28 True, 1 False
Q8 Response: Fair                29 True
 
 

SOC 301 Data Analysis, Spring 2004 Exam 1               ANSWERS
 

1.  Identify the level of measurement of the variables below:  10 points                                           

a.       How many schools you applied to for college             Interval

b.       Residents’ satisfaction with the Wilmington Police Department (WPD) measured as very satisfied, somewhat satisfied, somewhat dissatisfied, very dissatisfied        Ordinal

c.       Whether or not a student owns an SUV  Nominal

d.       Marital status                       Nominal

e.       Income measured as less than $10,000, $10,001 to $15,000, $15,001 to $20,000, more than $20,000        Ordinal                                                                          

 

2. Identify the independent and dependent variable in the following research questions: 10 points

a. Does household income determine the number of colleges that high school students apply to?   IV = income DV = # applications

b. Who commits more crime – people who own SUVs or people who do not?  IV = SUV ownership DV = # of crimes

c. Why does resident satisfaction with the WPD vary by race and ethnicity? IV = race and ethnicity DV = satisfaction with WPD

d. Does marital status influence the number of vacations people take each year? IV = marital status DV = # of vacations

e. How are attitudes about sexual diversity impacted by gender?    IV = gender  DV = attitudes about diversity

 

3.  What is a spurious relationship? Give an example not used in class or in a homework assignment.  Be sure to identify the spurious variable and why it is a spurious variable.     10 points

A spurious relationship is a relationship that appears to exist between an independent and dependent variable, but which is really caused by a third variable that influences both the independent and dependent variables and which is not being included in the analysis.

Examples from student exams:


4.  Below is data from a random sample of Americans on how many hours of TV they watch a day.                                                        

                        5, 0, 3, 0, 1, 2, 4, 2, 2                                                     

 

a.       Create a frequency table for this data.  10 points     

x  f  %       Valid%  Cum%
0  2  22.22%  22.22%  22.22%
1  1  11.11%  11.11%  33.33%
2  3  33.33%  33.33%  66.67%
3  1  11.11%  11.11%  77.78%
4  1  11.11%  11.11%  88.89%
5  1  11.11%  11.11%  100.00%
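
As a check on the table, here is a minimal Python sketch that builds the same frequency distribution from the nine raw scores. The variable names are illustrative only and were not part of the exam. Note that it cumulates the counts rather than the rounded percentages, so the last cumulative percent comes out to exactly 100.

```python
from collections import Counter

# TV hours per day for the nine respondents in question 4
hours = [5, 0, 3, 0, 1, 2, 4, 2, 2]

counts = Counter(hours)   # frequency (f) of each value (x)
n = len(hours)            # number of valid cases

rows = []
cum = 0
for x in sorted(counts):
    f = counts[x]
    cum += f              # running count, used for the cumulative percent
    rows.append((x, f, round(100 * f / n, 2), round(100 * cum / n, 2)))

# Print x, f, percent, and cumulative percent for each value
for x, f, pct, cum_pct in rows:
    print(f"{x}  {f}  {pct:.2f}%  {cum_pct:.2f}%")
```

There are no missing cases in this sample, so the percent and valid percent columns are identical and the sketch computes only one of them.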

                                                              

b.       What univariate statistics are appropriate for this data?  5 points

Level of Measurement is interval, so all univariate statistics apply.

c.       Calculate the univariate statistics identified above.  20 points

Mode = 2; Median = 2; Range = 0 to 5, or 5
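
These three values can be verified with a short Python sketch using the standard `statistics` module. This is offered only as a check, not as the method expected on the exam.

```python
import statistics

# TV hours per day for the nine respondents in question 4
hours = [5, 0, 3, 0, 1, 2, 4, 2, 2]

mode = statistics.mode(hours)            # most frequent value
median = statistics.median(hours)        # middle value of the sorted scores
value_range = max(hours) - min(hours)    # highest score minus lowest score

print(mode, median, value_range)
```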

Mean and  Standard Deviation are in table below.  See board for math.
 

 

x      f  fx  x*x  x*x*f
0      2  0   0    0
1      1  1   1    1
2      3  6   4    12
3      1  3   9    9
4      1  4   16   16
5      1  5   25   25
Total  9  19       63

Mean = 19/9 = 2.11; s = 1.69
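
The board math is not reproduced here, so the following Python sketch is a reconstruction from the table totals. It assumes the computational formula for the sample standard deviation with n - 1 in the denominator, which is the version that reproduces s = 1.69 (dividing by n instead would give about 1.59).

```python
import math

# Frequency table from question 4: value x -> frequency f
freq = {0: 2, 1: 1, 2: 3, 3: 1, 4: 1, 5: 1}

n = sum(freq.values())                              # total of the f column
sum_fx = sum(x * f for x, f in freq.items())        # total of the fx column
sum_fx2 = sum(x * x * f for x, f in freq.items())   # total of the x*x*f column

mean = sum_fx / n

# Assumed formula: s = sqrt((sum of x*x*f - (sum of fx)^2 / n) / (n - 1))
variance = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
s = math.sqrt(variance)

print(round(mean, 2), round(s, 2))
```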

d.       Are there any outliers in this data?  Indicate which method you choose to identify outliers and the value of the outlier(s).  5 points

Using 99% rule, 5 is an outlier.

Using 2 s rule, no outliers (2.11 +/- 2*1.69 = 5.49 on high side and -1.27 on low side)

Using 3 s rule, no outliers (2.11 +/- 3*1.69 = 7.18 on high side and -2.96 on low side)
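
The 2 s and 3 s checks can be sketched in Python as follows. The helper name `s_rule` is hypothetical and introduced only for illustration. It takes the rounded mean and standard deviation from part c and lists any scores that fall outside mean +/- k*s.

```python
# TV hours per day for the nine respondents in question 4
hours = [5, 0, 3, 0, 1, 2, 4, 2, 2]

mean = 2.11   # rounded mean from part c
s = 1.69      # rounded standard deviation from part c


def s_rule(values, mean, s, k):
    """Return (low, high, outliers) for the interval mean +/- k*s."""
    low = mean - k * s
    high = mean + k * s
    outliers = [v for v in values if v < low or v > high]
    return low, high, outliers


low2, high2, out2 = s_rule(hours, mean, s, 2)   # 2 s rule
low3, high3, out3 = s_rule(hours, mean, s, 3)   # 3 s rule

print(round(low2, 2), round(high2, 2), out2)
print(round(low3, 2), round(high3, 2), out3)
```

Every score lies between 0 and 5, which is inside both intervals, so both outlier lists come back empty.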

e.       Based on your answer to d, which measure of central tendency should you use to summarize this data?       Why?  5 points 

Since the 2 s and 3 s rules identify no outliers, the mean is the best measure of central tendency.

5. Below is data from the GSS on whether or not respondents have a pistol or revolver in their home. Interpret all appropriate univariate statistics. (20 points)

 

 

 

                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid    1  YES          364      12.9            20.0                 20.0
         2  NO          1459      51.8            80.0                100.0
         Total          1823                     100.0
Missing  0  NAP          956      33.9
         8  DK             1        .0
         9  NA            14        .5
         System           23        .8
Total                   2817     100.0

Statistics
Mean        1.80
Median      2.00
Mode        2
Std. Dev    .400
Variance    .160
Range       1

 

 

 

 

This variable is nominal so the mode and range are the only appropriate statistics.

Mode: Most Americans (80%) do not have a pistol or revolver in their home.

Range: Some Americans have a pistol or revolver in their home (20%) and some do not (80%).
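
The percentages behind this interpretation can be rechecked from the frequencies with a minimal Python sketch. The dictionary names are illustrative and are not GSS or SPSS identifiers. Valid percent divides by the valid cases only, while percent divides by all cases including the missing ones.

```python
# Frequencies from the GSS table in question 5
valid = {"YES": 364, "NO": 1459}
missing = {"NAP": 956, "DK": 1, "NA": 14, "System": 23}

valid_n = sum(valid.values())                # valid cases only
total_n = valid_n + sum(missing.values())    # all cases, including missing

# Percent uses all cases; valid percent uses valid cases only
percent = {k: round(100 * v / total_n, 1) for k, v in valid.items()}
valid_percent = {k: round(100 * v / valid_n, 1) for k, v in valid.items()}

# Mode: the category with the highest frequency
mode = max(valid, key=valid.get)

print(valid_n, total_n, percent, valid_percent, mode)
```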