Impact of Outliers on Distributions

Outliers are isolated values that are extremely high or extremely low. When they are present, the distribution is skewed in the direction of the outlier(s).

A.  How to identify outliers:

1.  More than 2 standard deviations from the mean

2.  More than 3 standard deviations from the mean

3.  Beyond the 99th percentile

4.  The appropriate cutoff depends on the study and the variable
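
The cutoffs listed above can be sketched in Python as follows. This is a minimal illustration, not a standard procedure: the function names (outliers_by_sd, outliers_above_percentile) and the sample ages are hypothetical, and it assumes the sample standard deviation and the "inclusive" percentile method from Python's statistics module.

```python
import statistics

def outliers_by_sd(values, k=3.0):
    """Return values more than k sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > k * sd]

def outliers_above_percentile(values, pct=99):
    """Return values above the pct-th percentile (pct between 1 and 99)."""
    # quantiles(n=100) returns 99 cut points; index pct - 1 is the pct-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    threshold = cuts[pct - 1]
    return [v for v in values if v > threshold]

# Hypothetical ages, made up for illustration only.
ages = [21, 22, 23, 24, 25, 26, 27, 28, 29, 99]

print("beyond 2 SD:", outliers_by_sd(ages, k=2))
print("beyond 3 SD:", outliers_by_sd(ages, k=3))
print("above 99th percentile:", outliers_above_percentile(ages, pct=99))
```

With this small sample, the 2-standard-deviation rule flags the age of 99 but the 3-standard-deviation rule does not, because the outlier itself inflates the standard deviation. This is one reason the cutoff depends on the study and the variable.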

B.  Outlier Effect on Central Tendency:

1.  Little impact on the mode and median

2.  Big impact on mean:

Extremely high values pull the mean up.

Extremely low values pull the mean down.

Ex. Age:

An age of 99 pulls the mean up to 60.

An age of 10 pulls the mean down to 19.

3.  In a normally distributed variable, extreme outliers are very rare: about 99.7% of values fall within 3 standard deviations of the mean.
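
The contrast between the mean and the median or mode (points 1 and 2 above) can be checked with a short Python sketch. The ages below are hypothetical and are not the data behind the lecture example; summarize is an illustrative helper, not a library function.

```python
import statistics

def summarize(values):
    """Return the mean, median, and mode of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }

# Hypothetical ages, made up for illustration only.
ages = [18, 19, 19, 20, 21]
with_high = ages + [99]   # add one extremely high value
with_low = ages + [1]     # add one extremely low value

for label, data in [("no outlier", ages),
                    ("high outlier", with_high),
                    ("low outlier", with_low)]:
    print(label, summarize(data))
```

In this sample the mean moves by several years when a single extreme age is added, while the median shifts by at most half a year and the mode does not change.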

C.  Outlier Effect on Dispersion:

1.  Big impact on range, variance, and standard deviation.

2.  Remove or transform them before calculating the standard deviation.
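
The effect on dispersion can be sketched the same way. This is a minimal illustration with hypothetical ages; dispersion is an illustrative helper, and it assumes the sample (n - 1) variance and standard deviation.

```python
import statistics

def dispersion(values):
    """Return the range, sample variance, and sample standard deviation."""
    return {
        "range": max(values) - min(values),
        "variance": statistics.variance(values),
        "sd": statistics.stdev(values),
    }

# Hypothetical ages, made up for illustration only.
ages_with_outlier = [18, 19, 19, 20, 21, 99]

# "Removing" the outlier here simply means dropping the value 99.
ages_cleaned = [a for a in ages_with_outlier if a != 99]

print("with outlier:   ", dispersion(ages_with_outlier))
print("outlier removed:", dispersion(ages_cleaned))
```

In this sample, one extreme value raises the range from 3 to 81 and makes the standard deviation many times larger, which is why the outlier is dealt with before the standard deviation is reported.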