Observed variables often contain rogue outlier values that lie far away from the sample mean. Especially in small samples, outliers can bias the previous summary statistics away from values representative of the majority of the sample.
This problem can be avoided either by eliminating or downweighting the outlier values in the sample (quality control), or by using statistics that are resistant to the presence of outliers. Note that the word robust should not be used to signify resistant, since in statistics it refers to insensitivity to the choice of probability model rather than to individual data values. Because the range is based on the extreme minimum and maximum values in the sample, it is a good example of a statistic that is not at all resistant to the presence of an outlier (and so should be interpreted very carefully!).
Resistant summary statistics can be obtained by using
the sample quantiles (percentiles/fractiles).
Quantiles are constructed by sorting (ranking) the data into
ascending order to obtain a sequence of
order statistics
$x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$.
The $p$'th quantile $q_p$ is then obtained by taking the $(1+(n-1)p)$'th
order statistic
$x_{(1+(n-1)p)}$ (or an average of neighbouring values if
$1+(n-1)p$ is not an integer).
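The construction above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `quantile` and the data in `sample` are hypothetical, and the "average of neighbouring values" is assumed here to be a linear interpolation between the two neighbouring order statistics.

```python
import math


def quantile(values, p):
    """Return the p'th sample quantile (0 <= p <= 1) of a non-empty sample."""
    if not values:
        raise ValueError("need at least one value")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")

    ordered = sorted(values)          # order statistics x_(1) <= ... <= x_(n)
    n = len(ordered)
    position = 1 + (n - 1) * p        # 1-based rank of the wanted order statistic
    lower = math.floor(position)
    upper = math.ceil(position)
    frac = position - lower           # zero when the rank is an integer

    # Assumption: interpolate linearly between the neighbouring order statistics.
    return ordered[lower - 1] + frac * (ordered[upper - 1] - ordered[lower - 1])


sample = [7, 1, 5, 3, 9]              # hypothetical data, not the height example
quartiles = [quantile(sample, p) for p in (0, 0.25, 0.5, 0.75, 1)]
print(quartiles)
```

With five values the ranks $1+(n-1)p$ are all integers, so each quartile is simply one of the sorted values. For an even-sized sample such as `[1, 2, 3, 4]`, the median falls at rank 2.5 and is the average of the second and third order statistics.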
For example, the quartiles of the height example are given by
$q_{0}=161$ (minimum value),
$q_{0.25}=171$ (lower quartile),
$q_{0.5}=175$ (median),
$q_{0.75}=180$ (upper quartile), and
$q_{1}=190$ (maximum value).
Unlike the arithmetic mean, the median is not at all influenced by
the exact values of the largest observations and so provides a resistant
measure of the central location.
Likewise, a resistant measure of the scale can be obtained using the
Inter-Quartile Range (IQR) given by the difference between the
upper and lower quartiles
$q_{0.75}-q_{0.25}$.
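The difference between resistant and non-resistant statistics can be illustrated with a small experiment. The sketch below uses made-up data (the lists `clean` and `contaminated` are hypothetical, not the height example) and the standard-library function `statistics.quantiles` with `method="inclusive"`, which is assumed here to match the $1+(n-1)p$ rule with linear interpolation.

```python
import statistics


def summary(values):
    """Return location and scale statistics for a sample."""
    # Quartiles from the standard library; "inclusive" treats the sample
    # minimum and maximum as the 0th and 100th percentiles.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": q2,
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


# Hypothetical heights in cm.
clean = [168, 171, 173, 175, 177, 180, 182]
# Same sample with the largest value mis-keyed (182 entered as 1820).
contaminated = clean[:-1] + [1820]

for name, data in (("clean", clean), ("contaminated", contaminated)):
    print(name, summary(data))
```

A single bad value shifts the mean and the range substantially, while the median and the IQR stay the same, because neither depends on the exact value of the largest observation.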
In the asymptotic limit of large sample size ($n \to \infty$),
for normally distributed variables, the sample median tends to
the sample mean and the sample IQR tends to approximately 1.35 times the
sample standard deviation.
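The factor relating the IQR to the standard deviation of a normal distribution can be checked directly from the normal quantile function. A minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist` (the variable names are illustrative):

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Theoretical IQR in units of the standard deviation:
# upper quartile minus lower quartile of the standard normal.
iqr_over_sigma = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)

print(round(iqr_over_sigma, 3))  # 1.349
```

Because the normal distribution is symmetric, the two quartiles sit equally far on either side of the mean, which is also why the median and the mean coincide in this limit.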
More resistant measures of skewness and kurtosis also exist
such as L-moments but are beyond the scope of this course.
Refer to von Storch and Zwiers (1999) for more details.