|  |  |  |  |  | 
   
    | 
         
          | Classification performance of bagging and boosting
              type ensemble methods with
small training sets |  | 
   
    |  |  |  |  |  | 
   
    | 
         
          | Faisal ZAMAN, Hideo HIROSE |  | 
   
    |  |  |  |  |  | 
   
    | 
         
          |  
              New Generation Computing, special issue
                  on Hybrid and Ensemble Methods in Machine Learning, Vol. 29,
              No. 3, pp. 277-292, August 2011 |  | 
   
    |  |  |  | 
   
    | 
         
          | Classification
              performance of an ensemble method can be understood by studying
              the contributions of bias and variance to its classification error.
              Statistically, the bias and variance of a single classifier are
              controlled by the size of the training set and the complexity of
              the classifier. It has been both theoretically and empirically
              established that the classification performance (hence bias and
              variance) of a single classifier can be partially improved by using
              a suitable ensemble method for the classifier and resampling the
              original training set. In this paper we have empirically examined
              the bias-variance decomposition of three different types of ensemble
              methods with different training sample sizes, ranging from 10%
              to a maximum of 63% of the observations from the original training
              sample. The first ensemble is bagging, the second is a boosting-type
              ensemble named AdaBoost, and the last is a bagging-type hybrid
              ensemble method called bundling. All the ensembles are trained
              on training samples constructed with small subsampling ratios (SSR)
              of 0.10, 0.20, 0.30, 0.40, and 0.50, and with bootstrapping. The experiments
              are all run on 20 datasets from the UCI Machine Learning Repository and
              designed to find the optimal training sample size (smaller
              than the original training sample) for each ensemble, and then to find
              the optimal ensemble with smaller training sets with respect
              to the bias-variance performance. The bias-variance decomposition
              of bundling shows that this ensemble method with small subsamples
              has significantly lower bias and variance than the subsampled and
          bootstrapped versions of bagging and AdaBoost. |   
          |  |  |  |   
          | 
               
                | Classification
                    Performance, Bias-Variance Decomposition, Small Subsample,
                Bagging, Boosting |  |   
          |  |  |  |  | 
   
    |  |  |  |  |  | 
  
    |  |  |  |  |  | 
   
    | 
         
          |  
              Times Cited in Web of Science: 4
              Times Cited in Google Scholar: 5
              Cited in Books:
              Cited in Proceedings:
              Mathematical Review:
              WoS: MEDICAL PHYSICS, Vol. 40, No. 10, Article No. 101906, published OCT 2013;
                JOURNAL OF UNIVERSAL COMPUTER SCIENCE, Vol. 19, No. 4, pp. 521-538, published 2013;
                INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, Vol. 22, No. 4, pp. 841-854, DOI: 10.2478/v10006-012-0062-1, published DEC 2012;
                INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS AND COMPUTER SCIENCE, Vol. 22, No. 4, pp. 867-881, DOI: 10.2478/v10006-012-0064-z, published DEC 2012 |   
          |  |  | 
   
    |  |  |  |  |  |