Monday, January 31, 2011


Anyone who has taken a basic statistics course is familiar with the idea that the mean can be heavily influenced by outliers. Basic mathematical statistics shows that the estimator for the mean can be derived by the method of least squares, i.e. $\min_c \sum_i (x_i - c)^2$, which is solved by c* = mean(x). Alternatively, if we specify a likelihood function using the Gaussian distribution, the same estimator can be derived by maximum likelihood.
M-estimators generalize least squares and maximum likelihood, and have certain properties that reduce sensitivity to outliers.

M-estimators address sensitivity to outliers by replacing the squared term $(x_i - c)^2$ with another function that gives less weight to extreme values, which otherwise can exert leverage or 'pull' mean estimates in the direction of the tail of a distribution. Setting the derivative of that new objective to zero gives the estimating equations below, where ψ denotes the derivative of the replacement function.
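
To make the 'pull' of an outlier concrete, here is a minimal Python sketch. The data values and the helper name sum_sq are made up for illustration; the median is shown only as a familiar resistant comparison, not as the M-estimator discussed in this post.

```python
import statistics

def sum_sq(xs, c):
    """Least-squares objective: sum of squared deviations from c."""
    return sum((x - c) ** 2 for x in xs)

# Hypothetical sample, then the same sample with its last value replaced by an outlier.
clean = [2.0, 3.0, 4.0, 5.0, 6.0]
contaminated = clean[:-1] + [60.0]

mean_clean = statistics.mean(clean)
mean_contaminated = statistics.mean(contaminated)
median_contaminated = statistics.median(contaminated)

# The mean minimizes the least-squares objective: nearby values of c do worse.
assert sum_sq(clean, mean_clean) < sum_sq(clean, mean_clean + 0.1)
assert sum_sq(clean, mean_clean) < sum_sq(clean, mean_clean - 0.1)

print(mean_clean, mean_contaminated, median_contaminated)
```

A single extreme value moves the mean far from the bulk of the data, while the median stays put.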

Wilcox provides a basic expression for deriving an M-estimator:

Choose $\mu_m$ to satisfy $\sum_i \psi\left(\frac{x_i - \mu_m}{\tau}\right) = 0$

 τ  = a measurement of scale

In Hoaglin, Mosteller, & Tukey a more detailed expression is given:

M-estimators are a family of estimators that all use the same auxiliary scale estimate; the value of the tuning constant determines the individual member of the family.

Choose $T_n$ to satisfy $\sum_i \psi\left(\frac{x_i - T_n}{c S_n}\right) = 0$

Sn = auxiliary estimator of scale
c = tuning constant
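
The estimating equation above has no closed-form solution in general, so it is usually solved iteratively. The following is a minimal sketch, not the procedure from either book. It assumes a Huber-type ψ (the identity, clipped to [-1, 1] on the standardized scale), takes Sn to be the median absolute deviation divided by 0.6745, uses c = 1.345 as the default tuning constant, and solves by iteratively reweighted averaging. The function names and data are hypothetical.

```python
import statistics

def huber_psi(u):
    """Huber-type psi on the standardized scale: identity, clipped to [-1, 1]."""
    return max(-1.0, min(1.0, u))

def m_estimate_location(xs, c=1.345, tol=1e-10, max_iter=200):
    """Solve sum(psi((x_i - T) / (c * S_n))) = 0 for T by iterative reweighting."""
    med = statistics.median(xs)
    # Assumed choice of S_n: median absolute deviation, rescaled by 0.6745.
    s_n = statistics.median([abs(x - med) for x in xs]) / 0.6745
    if s_n == 0:
        return med  # scale estimate collapsed; fall back to the median
    t = med  # resistant starting value
    for _ in range(max_iter):
        weights = []
        for x in xs:
            u = (x - t) / (c * s_n)
            # Weight w(u) = psi(u) / u, with w(0) = 1 by continuity.
            weights.append(1.0 if u == 0 else huber_psi(u) / u)
        t_new = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t

# Hypothetical sample with one outlier.
data = [2.0, 3.0, 4.0, 5.0, 60.0]
t_n = m_estimate_location(data)
print(t_n, statistics.mean(data))
```

At a fixed point the weighted average satisfies the sum of w_i (x_i - T) = 0, which is the estimating equation multiplied by c Sn. Observations far from T receive small weights, so the estimate stays near the bulk of the data rather than following the outlier.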

Properties of M-Estimators:

In Hoaglin,  Mosteller, and Tukey, Goodall describes two important properties of robust M-estimators:

Resistance: an estimator is resistant if it is altered to only a limited extent by a small proportion of outliers. The estimator breaks down if the proportion becomes too large. 

Robustness of Efficiency: over a range of distributions, the variance (or mean squared error) is close to minimum for each distribution.  This guarantees an estimator is good when repeated samples are drawn from a distribution that is not known precisely. 

Goodall gives examples of M-estimators including Tukey's biweight, Huber's, and Andrews' wave.
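
For reference, the ψ functions behind these three estimators can be sketched in Python as follows. This uses one common standardization in which u is the already-scaled residual (x - T)/(c S) and the cutoff sits at |u| = 1; that convention is an assumption here, and other texts place the tuning constant inside ψ instead.

```python
import math

def psi_huber(u):
    """Huber: linear in the middle, constant beyond the cutoff (monotone)."""
    return max(-1.0, min(1.0, u))

def psi_biweight(u):
    """Tukey's biweight: u * (1 - u^2)^2 inside the cutoff, zero outside."""
    return u * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0

def psi_andrews(u):
    """Andrews' wave: sin(pi * u) / pi inside the cutoff, zero outside."""
    return math.sin(math.pi * u) / math.pi if abs(u) <= 1.0 else 0.0

for u in (0.0, 0.5, 1.0, 3.0):
    print(u, psi_huber(u), psi_biweight(u), psi_andrews(u))
```

Huber's ψ only caps the influence of an extreme observation, whereas the biweight and the wave are redescending: beyond the cutoff they return zero, so a sufficiently extreme observation is ignored entirely.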


Introduction to Robust Estimation and Hypothesis Testing. 2nd Edition. Rand R. Wilcox.  Elsevier. 2005.

Understanding Robust and Exploratory Data Analysis. David C. Hoaglin. Frederick Mosteller. John W. Tukey. Wiley. 1983.
