Bias Correction for the Trade-off Curve in the Tree-GA Bump Hunting


Yu Aizawa and Hideo Hirose


5th International Conference on E-Service and Knowledge Management (ESKM 2014), August 31 - September 4, 2014, pp. 126-130, Kitakyushu, Japan (2014)

Bump hunting, proposed by Friedman and Fisher, has become important in many fields, such as marketing and medicine. In particular, local sparse bump hunting algorithms such as CART (Classification and Regression Trees) and PRIM (Patient Rule Induction Method) are useful for answering the unresolved question of molecular heterogeneity and tumoral phenotype in cancer. In bump hunting, we use the trade-off curve, rather than the misclassification rate used in classification problems, as the criterion for judging whether the algorithm works effectively. The trade-off curve is constructed by finding the relation between the pureness rate and the capture rate. So far, we have assessed the accuracy of the trade-off curve in typical fundamental cases that may be observed in practice, and found that the proposed tree-GA can construct an effective trade-off curve. In addition, we investigated the prediction accuracy of the tree-GA by comparing its trade-off curve with that obtained by PRIM, and found the tree-GA superior to PRIM when the sample size is large. In this paper, focusing on the sparse, small-sample-size cases observed in medical applications, we investigate the typical fundamental cases using Monte Carlo simulations and find that non-ignorable biases exist in the tree-GA. We propose a method to remove such biases.
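
As a rough illustration of how one point on a trade-off curve might be computed, the sketch below evaluates a candidate box-shaped region on labeled data. The abstract does not define the two rates, so the definitions used here are assumptions: the pureness rate is taken as the fraction of response-1 points among all points inside the region, and the capture rate as the fraction of all response-1 points that fall inside the region. The function names `pureness_and_capture` and `trade_off_points` are hypothetical, and the code is not the tree-GA itself.

```python
def pureness_and_capture(points, labels, box):
    """Return (pureness, capture) for one axis-aligned box.

    points: list of feature tuples
    labels: list of 0/1 responses, same length as points
    box:    list of (low, high) bounds, one pair per feature
    Definitions are assumed, not taken from the paper.
    """
    # A point is inside the box if every coordinate lies within its bounds.
    inside = [
        all(low <= x <= high for x, (low, high) in zip(p, box))
        for p in points
    ]
    n_inside = sum(inside)
    hits_inside = sum(1 for flag, y in zip(inside, labels) if flag and y == 1)
    total_hits = sum(1 for y in labels if y == 1)

    # Guard against empty regions and data with no response-1 points.
    pureness = hits_inside / n_inside if n_inside else 0.0
    capture = hits_inside / total_hits if total_hits else 0.0
    return pureness, capture


def trade_off_points(points, labels, boxes):
    """Evaluate several candidate boxes and sort the results by capture rate."""
    pairs = [pureness_and_capture(points, labels, b) for b in boxes]
    # Each pair is (pureness, capture); sort on the capture rate.
    return sorted(pairs, key=lambda pc: pc[1])


# Tiny made-up data set, for illustration only.
points = [(0.1, 0.1), (0.2, 0.2), (0.8, 0.8), (0.9, 0.1)]
labels = [1, 1, 0, 1]
small_box = [(0.0, 0.5), (0.0, 0.5)]
whole_space = [(0.0, 1.0), (0.0, 1.0)]
curve = trade_off_points(points, labels, [whole_space, small_box])
```

On this toy data the small box is purer but captures fewer response-1 points than the whole space, which is the tension the trade-off curve is meant to summarize.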
