DF-ReaL2Boost: A Hybrid Decision Forest with Real L2Boosted Decision Stumps - An Application to Credit Scoring Classification and Short Term Rainfall Forecasting


Zaman M. Faisal, Sumi S. Monira, Hideo Hirose

2011 International Conference on Data Engineering and Internet Technology (DEIT 2011), 15-17 March 2011, Bali, Indonesia

In this hybrid decision forest, each individual base decision tree classifier is integrated with an additional classifier model, the boosted decision stump. In the real boosted decision stump, the class probability estimate is converted to a real-valued scale using the half-log ratio. This value is then used to represent an observation's contribution to the final overall model. Furthermore, observation weights for subsequent iterations are updated according to the binomial log-likelihood (L2) loss function, which is more robust against noisy outcomes. The boosted decision stump is trained on extra samples (defined as out-of-bag samples) that differ from those used to train the base tree classifiers. This extra sample, together with the subsample on which the base tree classifiers are trained, approximates the original training set; in this way the full training set is utilized to construct a hybrid decision forest with a larger feature space. To train the additional boosted decision stumps better, we enlarged the extra sample by using small subsample ratios, namely 0.20, 0.30, 0.40 and 0.50. We applied this hybrid decision forest to two real-world applications: (a) classifying credit scores and (b) short-term extreme rainfall forecasting. To check its performance, we also compared the results with relevant prediction methods for the two applications. Overall, the results suggest that the new hybrid decision forest yields better predictive performance than most of the compared methods in both applications.
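As a rough illustration of two ingredients named in the abstract, the following Python sketch shows the half-log ratio that maps a class probability estimate to a real-valued score, and a split of the training indices into a subsample and its out-of-bag complement. The function names, the clipping constant `eps`, the seed, and the choice of sampling without replacement are assumptions made for this example; the abstract does not specify them, and this is not the authors' implementation.

```python
import math
import random


def half_log_ratio(p, eps=1e-6):
    """Map a class probability estimate p to a real-valued score.

    Uses the half-log ratio 0.5 * log(p / (1 - p)), as in Real AdaBoost.
    The clipping constant eps is an assumption added to avoid log(0).
    """
    p = min(max(p, eps), 1.0 - eps)
    return 0.5 * math.log(p / (1.0 - p))


def split_subsample(n, ratio, seed=0):
    """Split indices 0..n-1 into a subsample and its out-of-bag complement.

    Sampling without replacement is an assumption of this sketch.
    """
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    k = int(round(ratio * n))
    return sorted(indices[:k]), sorted(indices[k:])


# Hypothetical usage: a subsample ratio of 0.30 leaves 70% of the
# observations out-of-bag for training the boosted decision stump.
in_bag, out_of_bag = split_subsample(n=10, ratio=0.30)
scores = [half_log_ratio(p) for p in (0.1, 0.5, 0.9)]
```

With a small subsample ratio, most observations fall in the out-of-bag set, which is the sense in which the smaller ratios enlarge the extra sample used for the stumps.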

Key Words
decision forest; real adaboost; logistic loss; credit classification; rainfall forecast.


