Examples of Secondary Analysis
Studies of women's voting patterns after they won the right to vote
Types of Available Data
Population, family, housing, social security and welfare, health and nutrition, crime and deviance, education and training, work, income and wealth, culture and leisure, social mobility and participation
Finding Datasets
Reference librarian
ICPSR (a searchable archive of over 17,000 datasets)
Evaluating the Appropriateness of Secondary Data
1. Does the data shed light on your research question?
2. Are the concepts operationalized appropriately for your research question?
   Example: Pornography study
   Is there a possibility of creating an index?
   Example: Pornography study
3. Does the data have the same unit of analysis as your research question?
4. Does the sample represent your population?
5. Can you specify an accurate model? Are the independent variables you need in the data?
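Item 2 asks whether an index can be created when no single variable captures your concept. As a minimal sketch of one common approach (an additive index that averages several related survey items), the Python below uses hypothetical item names and made-up 1-5 scores; the function name and the `min_answered` rule are assumptions for illustration, not part of any particular dataset.

```python
from statistics import mean

def additive_index(responses, items, min_answered=2):
    """Average the answered items; return None if too few were answered."""
    answered = [responses[i] for i in items if responses.get(i) is not None]
    if len(answered) < min_answered:
        return None  # too little information to score this respondent
    return mean(answered)

# Hypothetical item names and 1-5 agreement scores for one respondent.
ITEMS = ["item_a", "item_b", "item_c"]
respondent = {"item_a": 4, "item_b": 5, "item_c": None}

score = additive_index(respondent, ITEMS)
print(score)
```

Averaging rather than summing keeps scores comparable when a respondent skipped an item, but the items should still be checked to confirm they measure the same concept.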
Validity and Reliability in Secondary Data
Both have to do with operationalization of your key concepts
Validity: Does the variable in the dataset measure your concept?
Validity Examples:
Work Injury: You define it as minor cuts, bruises, and sprains. The U.S. Government dataset defines it as an injury that required a physician or hospital visit.
Income: You define it as individual income. The dataset defines it as household income.
Unemployment: You define it as anyone not working but who would work. U.S. Government data defines unemployment as only people who are currently actively looking for work (leaves out discouraged workers, the underemployed, and some self-employed). Economic vs. social policy goals.
Crime data: U.S. government data includes reported crimes, not whether a crime happened (victimization).
Premarital Pregnancy: The research question is the number of marriages that were caused by pregnancy. The data used are birth certificates that include the baby's date of birth and the parents' date of marriage. Problems:
Reliability: Is there a change in the conceptualization and operationalization of key phenomena over time?
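To see why a change in operationalization matters, here is a minimal sketch based on the unemployment example in the list that follows. All counts are made up for illustration, and the function name is hypothetical. It computes the same rate with and without military employees in the denominator, so the series shifts even though nothing real has changed.

```python
def unemployment_rate(unemployed, civilian_workforce, military=0):
    """Unemployed as a share of the workforce, optionally counting military."""
    return unemployed / (civilian_workforce + military)

# Made-up counts (in thousands), for illustration only.
unemployed = 8_000
civilian = 150_000
military = 2_000

rate_civilian_only = unemployment_rate(unemployed, civilian)
rate_with_military = unemployment_rate(unemployed, civilian, military)

print(f"civilian-only denominator: {rate_civilian_only:.2%}")
print(f"with military included:    {rate_with_military:.2%}")
```

A larger denominator lowers the rate, so comparing years measured under different definitions would suggest a trend that is partly an artifact of measurement.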
Reliability Examples:
· Unemployment. Used to be measured as # of unemployed / # of people in civilian workforce (omitted military employees)
· GDP. Went from Gross National Product to Gross Domestic Product (the two differ in how they treat income earned by foreigners here and by Americans outside of the country – now counted)
· Crime rates increase when police departments improve computer programs/systems
· Units in national and international datasets are not equally reliable:
   · National: police departments, hospitals
   · International: unemployment, poverty (and many other concepts) measured differently across countries
· Variation in interviewer reliability
· Missing Data: Can be a huge problem in secondary data analysis. Since the data is already collected, there is nothing you can do to prevent it. Can use statistical procedures to adjust or correct the data (imputation)
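As an illustration of the imputation idea in the last bullet, the sketch below shows the simplest form, mean imputation, on made-up income values where `None` marks a missing response. The function name is hypothetical, and this is only one of several procedures a researcher might choose.

```python
from statistics import mean

def mean_impute(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Made-up incomes; None marks respondents who did not answer.
incomes = [30_000, None, 50_000, 40_000, None]
imputed = mean_impute(incomes)
print(imputed)
```

Mean imputation keeps every case in the analysis but shrinks the variable's variance, which is why more careful methods such as multiple imputation are often preferred.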
Ethical Issues in Secondary Data Analysis
Fewer concerns since you didn't collect the data and have no control over collection processes.
Need to maintain confidentiality
Researcher integrity in conducting statistical analyses