Impact of Outliers on Distributions
Outliers are isolated values that are extremely high or extremely low
relative to the rest of the data. If they exist, the distribution is
skewed in the direction of the outlier(s).
A. How to identify outliers:
a. More than 2 standard deviations from the mean
b. More than 3 standard deviations from the mean
c. Beyond the 99th percentile
d. The right cutoff depends on the study and the variable
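
The cutoffs listed above can be compared on a small data set. The following is a minimal sketch, not a prescribed method: the ages are invented for illustration, the function names are hypothetical, and the percentile rule assumes the "inclusive" interpolation method of Python's statistics module.

```python
import statistics

# Hypothetical ages, invented for illustration: nine typical values, one extreme.
ages = [21, 22, 23, 24, 25, 26, 27, 28, 29, 99]

def outliers_by_sd(values, k):
    """Return values more than k sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > k * sd]

def outliers_above_percentile(values, pct=99):
    """Return values above the pct-th percentile (inclusive interpolation)."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return [v for v in values if v > cuts[pct - 1]]

print(outliers_by_sd(ages, 2))          # [99]
print(outliers_by_sd(ages, 3))          # []
print(outliers_above_percentile(ages))  # [99]
```

In this small sample the age of 99 is flagged by the 2 standard deviation rule but not by the 3 standard deviation rule, partly because the outlier itself inflates the standard deviation. This is one reason the choice of cutoff depends on the study and the variable.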
B. Outlier Effect on Central Tendency:
1. Little impact on the mode and median
2. Big impact on mean:
Extremely high values pull the mean up.
Extremely low values pull the mean down.
Ex. Age
An age of 99 pulls the mean up to 60.
An age of 10 pulls the mean down to 19.
3. In a normally distributed variable, extreme outliers are rare: only
about 0.3% of values fall more than 3 standard deviations from the mean.
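
The contrast between the mean and the other two measures can be checked with a short calculation. This is a sketch under stated assumptions: the ages are invented for illustration and are not the data behind the example above.

```python
import statistics

# Hypothetical ages, invented for illustration.
typical = [20, 21, 21, 22, 23]
with_outlier = typical + [99]  # add one extremely high value

for label, data in [("without outlier", typical), ("with outlier", with_outlier)]:
    print(
        label,
        "mean =", round(statistics.mean(data), 1),
        "median =", statistics.median(data),
        "mode =", statistics.mode(data),
    )
```

Adding the single value of 99 raises the mean from 21.4 to about 34.3, while the median only moves from 21 to 21.5 and the mode stays at 21.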
C. Outlier Effect on Dispersion:
1. Big impact on range, variance, and standard deviation.
2. Consider removing or transforming them before calculating the
standard deviation.
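
The effect on dispersion, and the two remedies, can be illustrated the same way. This is a minimal sketch: the ages are invented, the natural log is only one possible transformation, and the function name spread is hypothetical.

```python
import math
import statistics

# Hypothetical ages, invented for illustration.
typical = [20, 21, 21, 22, 23]
with_outlier = typical + [99]

def spread(values):
    """Return (range, sample variance, sample standard deviation)."""
    value_range = max(values) - min(values)
    return value_range, statistics.variance(values), statistics.stdev(values)

print("with outlier:   ", spread(with_outlier))
print("outlier removed:", spread(typical))

# Transform instead of remove: take the natural log of every value.
# The result is on a log scale, so it is not in years.
logged = [math.log(v) for v in with_outlier]
print("log-transformed SD:", statistics.stdev(logged))
```

With the outlier, the range is 79 and the standard deviation is above 30. With it removed, the range is 3 and the standard deviation is just over 1. The log transform keeps the extreme value in the data but compresses its distance from the rest.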