M-estimators generalize least squares and maximum likelihood estimation, and suitably chosen members of the family are far less sensitive to outliers than the sample mean.
They achieve this by replacing the squared deviation $(x - c)^2$ minimized in least squares with another function that gives less weight to extreme values, which can otherwise exert leverage and pull the estimated mean toward the tail of a distribution.
Wilcox provides a basic expression for deriving an M-estimator:
Choose $\mu_m$ to satisfy $\sum_{i=1}^{n} \psi\left(\frac{x_i - \mu_m}{\tau}\right) = 0$,
where $\tau$ is a measure of scale.
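As a quick check on this estimating equation, two familiar estimators drop out as special cases. These are standard results offered here as an illustration, not taken from Wilcox; the sample size n and the sign function are notation introduced for the example.

```latex
\begin{align*}
\psi(u) = u &: \quad \sum_{i=1}^{n} \frac{x_i - \mu_m}{\tau} = 0
  \;\Longrightarrow\; \mu_m = \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x} \\
\psi(u) = \operatorname{sign}(u) &: \quad \sum_{i=1}^{n} \operatorname{sign}(x_i - \mu_m) = 0
  \;\Longrightarrow\; \mu_m = \text{a sample median}
\end{align*}
```

In the first case every observation contributes in proportion to its distance from the estimate, which is why the mean is easily pulled by extreme values. In the second case each observation contributes only its sign, so an extreme value counts no more than any other.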
In Hoaglin, Mosteller, & Tukey a more detailed expression is given. M-estimators form a family of estimators that all involve the same auxiliary scale estimate, with the value of the tuning constant determining the individual member of the family:
Choose $T_n$ to satisfy $\sum_{i=1}^{n} \psi\left(\frac{x_i - T_n}{c S_n}\right) = 0$,
where $S_n$ is an auxiliary estimator of scale
and $c$ is a tuning constant.
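The estimating equation above can be solved numerically. The following is a minimal sketch, not the procedure given in either book, and it rests on several assumptions: ψ is Huber's function clipped at ±1, the auxiliary scale Sn is the median absolute deviation rescaled by 1.4826, the tuning constant defaults to c = 1.345, and the solver is a simple iteratively reweighted mean started from the median. The function names are hypothetical.

```python
import statistics


def huber_psi(u, k=1.0):
    """Huber's psi: the identity for |u| <= k, clipped to +/-k beyond that."""
    return max(-k, min(k, u))


def mad_scale(xs):
    """Median absolute deviation, rescaled by 1.4826 (assumed choice of Sn)."""
    med = statistics.median(xs)
    return 1.4826 * statistics.median([abs(x - med) for x in xs])


def m_estimate(xs, c=1.345, tol=1e-9, max_iter=100):
    """Solve sum(psi((x_i - T) / (c * Sn))) = 0 for T by reweighting."""
    s = mad_scale(xs)
    t = statistics.median(xs)  # resistant starting value
    if s == 0:
        return t  # more than half the data are identical; keep the median
    for _ in range(max_iter):
        weights = []
        for x in xs:
            u = (x - t) / (c * s)
            # w = psi(u) / u, so that sum(w * (x - T)) = 0 at the solution
            weights.append(1.0 if u == 0 else huber_psi(u) / u)
        t_new = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t


data = [2.1, 1.9, 2.0, 2.2, 1.8, 2.05, 50.0]
print("mean:", statistics.mean(data))
print("Huber M-estimate:", m_estimate(data))
```

At a fixed point of the iteration the weighted residuals sum to zero, which is the estimating equation itself. A smaller c clips more observations and moves the estimate toward the median, while a very large c leaves every weight at 1 and reproduces the mean.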
Properties of M-Estimators:
In Hoaglin, Mosteller, and Tukey, Goodall describes two important properties of robust M-estimators:
Resistance: an estimator is resistant if it is altered to only a limited extent by a small proportion of outliers. The estimator breaks down if the proportion becomes too large.
Robustness of Efficiency: over a range of distributions, the variance (or mean squared error) of the estimator is close to the minimum attainable for each distribution. This makes the estimator a good choice when repeated samples are drawn from a distribution that is not known precisely.
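Resistance and breakdown can be seen in a small experiment. The sketch below is an illustration under stated assumptions rather than an example from Goodall: it uses the sample mean and the sample median as the two extreme cases (the median being the M-estimator with ψ(u) = sign(u)), a made-up data set, and a hypothetical helper named contaminate.

```python
import statistics

# Made-up, well-behaved sample centered near 10
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]


def contaminate(xs, n_bad, bad_value=1000.0):
    """Replace the last n_bad observations with a gross outlier."""
    return xs[: len(xs) - n_bad] + [bad_value] * n_bad


# Increase the number of outliers and watch each estimator respond
for n_bad in (0, 1, 3, 4, 5, 6):
    sample = contaminate(clean, n_bad)
    print(
        f"outliers={n_bad}  "
        f"mean={statistics.mean(sample):.2f}  "
        f"median={statistics.median(sample):.2f}"
    )
```

A single outlier is enough to move the mean far from 10, so the mean is not resistant. The median stays near 10 while fewer than half of the observations are contaminated and breaks down once the outliers reach half of the sample.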
Goodall gives examples of M-estimators, including Tukey's biweight, Huber's estimator, and Andrews' wave.
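The three estimators differ only in the choice of ψ. The sketch below writes out commonly used forms of each function for a residual u that has already been divided by the scale and tuning constant. The exact parameterizations (Huber's cutoff k = 1, the biweight vanishing beyond |u| = 1, the wave vanishing beyond |u| = π) are assumptions for illustration and may differ from the conventions used by Goodall; the function names are hypothetical.

```python
import math


def huber_psi(u, k=1.0):
    """Huber: linear in the middle, constant at +/-k in the tails."""
    return max(-k, min(k, u))


def biweight_psi(u):
    """Tukey's biweight: u * (1 - u^2)^2 for |u| <= 1, zero beyond."""
    return u * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def andrews_psi(u):
    """Andrews' wave: sin(u) for |u| <= pi, zero beyond."""
    return math.sin(u) if abs(u) <= math.pi else 0.0


# Compare how each psi treats a small, moderate, and extreme scaled residual
for u in (0.5, 1.0, 5.0):
    print(u, huber_psi(u), biweight_psi(u), andrews_psi(u))
```

Huber's ψ never returns to zero, so an extreme observation keeps a bounded influence. The biweight and the wave are redescending: beyond their cutoff an observation contributes nothing at all to the estimating equation.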
Introduction to Robust Estimation and Hypothesis Testing. 2nd Edition. Rand R. Wilcox. Elsevier. 2005.
Understanding Robust and Exploratory Data Analysis. David C. Hoaglin. Frederick Mosteller. John W. Tukey. Wiley. 1983.