Wednesday, March 6, 2013

Decision Trees and Gradient Boosting

Decision Trees

Decision tree algorithms search the input space for values of the input variables (split values) that maximize the difference in the target value between the groups created by the split. The final model is characterized by the split values chosen for each explanatory variable, which together form a set of rules for classifying new cases.
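To make the split search concrete, here is a minimal sketch for a single numeric input and a 0/1 target. It assumes Gini impurity reduction as the measure of "difference between groups", which is one common criterion and not necessarily the one used by any particular software package. The function names and the toy income data are hypothetical, chosen only for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(x, y):
    """Return (threshold, impurity_reduction) for one numeric input x and a 0/1 target y."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    parent = gini([label for _, label in pairs])
    best_threshold, best_gain = None, 0.0
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical input values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Impurity of the two child groups, weighted by group size.
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        if gain > best_gain:
            best_threshold, best_gain = (lo + hi) / 2.0, gain
    return best_threshold, best_gain


# Hypothetical toy data: income (in thousands) and whether a loan was repaid.
income = [20, 25, 30, 35, 60, 65, 70, 80]
repaid = [0, 0, 0, 1, 1, 1, 1, 1]
threshold, gain = best_split(income, repaid)
print(threshold, gain)
```

On this toy data the chosen split value separates the two classes perfectly, so the resulting rule is simply "income above the threshold: predict repaid". A full tree repeats the same search within each resulting group and across all input variables.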

Gradient Boosting

Boosting algorithms are ensemble methods that combine a series of weak learners into a single predictor, typically as a weighted sum of their individual predictions. Gradient boosting fits a series of trees sequentially, with each successive tree fit to the errors left by the trees before it, that is, to the residuals or, more generally, the negative gradient of the loss function (Friedman, 2001). Some implementations also fit each tree to a resampled version of the training set, and earlier boosting algorithms such as AdaBoost instead reweight the training cases according to the classification accuracy of the previously fit tree. In every case, the combined series of trees forms a single predictive model. This differs from other tree-based ensemble methods, such as random forests. Random forests are a modified type of bootstrap aggregation, or bagging, estimator (Hastie et al., 2009). With random forests, the predictor is an average of a series of trees, each grown on a bootstrap sample of the training data using only a random subset of the available inputs (De Ville, 2006). Random forests often perform similarly to gradient boosting, and boosting tends to dominate plain bagging in many applications (Hastie et al., 2009).
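The sequential fitting behind gradient boosting can be sketched in a few lines. The example below is a simplified illustration, not a production implementation: it assumes squared-error loss (so the negative gradient is just the residual), a single numeric input, one-split trees (stumps) as the weak learners, and no resampling. The function names, the learning rate of 0.1, and the toy data are all hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression tree to the residuals by least squares."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]  # (threshold, left prediction, right prediction)


def predict_stump(stump, xi):
    t, left_mean, right_mean = stump
    return left_mean if xi <= t else right_mean


def fit_boosted(x, y, n_trees=50, learning_rate=0.1):
    """Start from the mean of y, then add stumps fit to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient equals the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * predict_stump(stump, xi)
                 for xi, pi in zip(x, preds)]
    return base, learning_rate, stumps


def predict_boosted(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


def mse(y, preds):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


# Hypothetical toy data with a jump between x = 4 and x = 5.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = fit_boosted(x, y)
baseline_mse = mse(y, [sum(y) / len(y)] * len(y))
boosted_mse = mse(y, [predict_boosted(model, xi) for xi in x])
print(baseline_mse, boosted_mse)
```

Each stump on its own is a poor model, but because every new stump is fit to what the current ensemble still gets wrong, the training error of the combined model falls well below that of the constant baseline. The small learning rate shrinks each tree's contribution, which is a common way to trade more trees for a smoother fit.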


Friedman, Jerome H. (2001), Greedy function approximation: A gradient boosting machine. The Annals of Statistics, 29, 1189-1232. Available at http://stat.stanford.

Hastie, Tibshirani and Friedman. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer-Verlag.

De Ville, Barry. (2006). Decision Trees for Business Intelligence and Data Mining: Using SAS® Enterprise Miner. SAS® Institute.
