Estimation of Optimal Sample Size in Decision Forest of SVM with Embedded Cross-validation Method

Faisal Zaman and Hideo Hirose

3rd Asian Conference on Intelligent Information and Database Systems (ACIIDS 2011), Daegu, Korea, April 20-22, 2011.
In this paper, the performance of the $m$-out-of-$n$ decision forest
of SVM, built by subsampling without replacement at different subsampling
ratios ($\frac{m}{n}$), is analyzed using an \emph{embedded cross-validation}
technique. The subsampling ratio plays a pivotal role in improving the
performance of the decision forest of SVM, because the SVM in this ensemble
enlarges the feature space of the underlying base decision tree classifiers
and guarantees an improved performance of the ensemble overall. To ensure
better training of the SVM, the out-of-bag sample is generally kept larger,
but there is no general rule for estimating the optimal sample size for the
decision forest. We propose to use the embedded cross-validation method to
select a near-optimal value of the sampling ratio. Under our criterion, a
decision forest of SVM trained on independent samples whose size makes the
cross-validation error of the ensemble as low as possible will produce
improved generalization performance for the ensemble.
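The selection rule described in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper. Plain k-fold cross-validation stands in for the embedded cross-validation technique, and a majority-vote ensemble of decision stumps stands in for the decision forest of SVM. The function names, the candidate ratios, and the toy data are all hypothetical. The out-of-bag indices are returned only because the abstract suggests the SVM is trained on that part of the data; the stand-in learner does not use them.

```python
import random

def subsample_without_replacement(n, ratio, rng):
    """Split range(n) into an m-out-of-n in-bag part and its out-of-bag rest."""
    m = max(1, min(n - 1, round(ratio * n)))  # m = ratio * n, kept within 1..n-1
    idx = list(range(n))
    rng.shuffle(idx)
    return idx[:m], idx[m:]

def fit_stump(X, y):
    """Stand-in base learner (assumption): best single-feature threshold rule."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum((1 if sign * (row[j] - t) > 0 else 0) != label
                          for row, label in zip(X, y))
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    _, j, t, sign = best
    return lambda row: 1 if sign * (row[j] - t) > 0 else 0

def fit_ensemble(X, y, ratio, n_members, rng):
    """Train each member on its own subsample; predict by majority vote."""
    members = []
    for _ in range(n_members):
        in_bag, _out_of_bag = subsample_without_replacement(len(X), ratio, rng)
        members.append(fit_stump([X[i] for i in in_bag], [y[i] for i in in_bag]))
    return lambda row: 1 if 2 * sum(f(row) for f in members) > len(members) else 0

def cv_error(X, y, ratio, k=5, n_members=11, seed=0):
    """k-fold cross-validation error of the ensemble at one subsampling ratio."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    wrong = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        predict = fit_ensemble([X[i] for i in train], [y[i] for i in train],
                               ratio, n_members, rng)
        wrong += sum(predict(X[i]) != y[i] for i in fold)
    return wrong / len(X)

def select_ratio(X, y, ratios=(0.2, 0.35, 0.5, 0.65, 0.8)):
    """Return the candidate ratio with the lowest cross-validation error."""
    errors = {r: cv_error(X, y, r) for r in ratios}
    return min(errors, key=errors.get), errors

# Toy two-class data (hypothetical): the first feature carries the signal.
data_rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[data_rng.gauss(2.0 * label, 1.0), data_rng.gauss(0.0, 1.0)] for label in y]
best_ratio, errors = select_ratio(X, y)
```

Here `best_ratio` is the candidate whose ensemble has the lowest cross-validated error, which is the sense in which the abstract calls the chosen ratio near-optimal. The grid of candidate ratios is a design choice of this sketch, so the result is only as fine-grained as that grid.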
Optimal sampling ratio, Decision forest of SVM, Embedded cross-validation error.