Additive Groves of Regression Trees

MACHINE LEARNING: ECML 2007, PROCEEDINGS(2007)

Citations 51 | Views 1
Abstract
We present a new regression algorithm called Additive Groves and show empirically that it is superior in performance to a number of other established regression methods. A single Grove is an additive model containing a small number of large trees. Trees added to a Grove are trained on the residual error of the other trees already in the model. We begin the training process with a single small tree and gradually increase both the number of trees in the Grove and their size. This procedure ensures that the resulting model captures the additive structure of the response. A single Grove may still overfit the training set, so we further decrease the variance of the final predictions with bagging. We show that in addition to exhibiting superior performance on a suite of regression test problems, Additive Groves are very resistant to overfitting.

We present a new regression algorithm called Additive Groves, an ensemble of additive regression trees. We initialize a single Grove with a single small tree. The Grove is then gradually expanded: on every iteration either a new tree is added, or the trees already in the Grove are made larger. This process is designed to find the simplest model (a Grove with the fewest and smallest trees) that captures the underlying additive structure of the target function. As training progresses, the algorithm yields a sequence of Groves of slowly increasing complexity. Eventually, the largest Groves may begin to overfit the training set even as they continue to learn important additive structure. This overfitting is reduced by applying bagging on top of the Grove learning process.

In Section 2 we describe the Additive Groves algorithm step by step, beginning with the classical way of training additive models and incrementally making this process more complex, and better performing, at each step. In Section 3 we compare Additive Groves with two other regression ensembles: bagged regression trees and stochastic gradient boosting. The results show that bagged Groves outperform these other methods and work especially well on highly non-linear data sets. In Section 4 we show that bagged Groves are resistant to overfitting. We conclude and discuss future work in Section 5.
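The core training loop described above (trees fit on the residuals of the other trees in the Grove, with bagging on top) can be sketched in a few lines. The following Python sketch is illustrative only and is not the authors' implementation: it fits a fixed-size Grove by repeatedly refitting each tree on the residuals of the rest, then bags several Groves trained on bootstrap samples. The gradual growth in the number and size of trees described in the abstract is omitted for brevity, and all function names, hyperparameters, and default values (n_trees, max_leaf_nodes, n_cycles, n_bags) are assumptions chosen for illustration.

    # Minimal sketch of the Grove idea, not the published implementation.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def fit_grove(X, y, n_trees=5, max_leaf_nodes=50, n_cycles=10):
        """Fit one Grove: a few large trees trained additively on residuals."""
        preds = np.zeros((n_trees, len(y)))
        trees = [None] * n_trees
        for _ in range(n_cycles):                # cycle until the fit stabilizes
            for i in range(n_trees):
                # Residual of all trees except tree i
                residual = y - (preds.sum(axis=0) - preds[i])
                tree = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes)
                tree.fit(X, residual)
                trees[i] = tree
                preds[i] = tree.predict(X)
        return trees

    def predict_grove(trees, X):
        # A Grove predicts with the sum of its trees (an additive model)
        return sum(t.predict(X) for t in trees)

    def fit_bagged_groves(X, y, n_bags=20, **grove_kw):
        """Bag Groves: train each Grove on a bootstrap sample to reduce variance."""
        rng = np.random.default_rng(0)
        groves = []
        for _ in range(n_bags):
            idx = rng.integers(0, len(y), size=len(y))   # bootstrap sample
            groves.append(fit_grove(X[idx], y[idx], **grove_kw))
        return groves

    def predict_bagged(groves, X):
        # Average the predictions of all bagged Groves
        return np.mean([predict_grove(g, X) for g in groves], axis=0)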
Keywords
new regression algorithm, bagged groves, regression test problem, additive structure, regression trees, additive groves, additive model, single grove, small number, resulting model, show empirically, established regression method, regression testing, regression tree