The online platform for Taylor & Francis Group content

International Journal of Smart Engineering System Design

Volume 5, Issue 2, 2003

Bayes Error Rate Estimation Using Classifier Ensembles

DOI: 10.1080/10255810305042
Kagan Tumer (a) & Joydeep Ghosh (b)

pages 95-109

The Bayes error rate gives a statistical lower bound on the error achievable for a given classification problem and the associated choice of features. By reliably estimating this rate, one can assess the usefulness of the feature set that is being used for classification. Moreover, by comparing the accuracy achieved by a given classifier with the Bayes rate, one can quantify how effective that classifier is. Classical approaches for estimating or bounding the Bayes error generally yield rather weak results for small sample sizes, unless the problem has some simple characteristics, such as Gaussian class-conditional likelihoods. This article shows how the outputs of a classifier ensemble can be used to provide reliable and easily obtainable estimates of the Bayes error with negligible extra computation. Three methods of varying sophistication are described. First, we present a framework that estimates the Bayes error when multiple classifiers, each providing an estimate of the a posteriori class probabilities, are combined through averaging. Second, we bolster this approach by adding an information-theoretic measure of output correlation to the estimate. Finally, we discuss a more general method that uses only the class labels indicated by ensemble members and provides error estimates based on the disagreements among classifiers. The methods are illustrated for artificial data, a difficult four-class problem involving underwater acoustic data, and two problems from the Proben1 benchmarks. For data sets with known Bayes error, the combiner-based methods introduced in this article outperform existing methods. The estimates obtained by the proposed methods also seem quite reliable for the real-life data sets for which the true Bayes rates are unknown.
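The abstract does not state the estimator itself, so the following is only a rough sketch of how the first (averaging-based) method could work. It assumes, for illustration, that each classifier's total error splits into the Bayes error plus an added error, and that averaging N classifiers whose errors have average correlation delta shrinks the added error by the factor (1 + delta(N - 1)) / N. Under that assumed model, the single-classifier error and the combiner error form two equations that can be solved for the Bayes error. The function names, the shrink factor, and the numbers are hypothetical and may differ from the article's actual formulation.

```python
def averaged_error(e_bayes, e_added, n, delta):
    """Forward model (assumed): total error of an averaging combiner.

    e_bayes: Bayes error of the problem
    e_added: average added error of one ensemble member
    n:       number of classifiers being averaged
    delta:   average correlation among the members' errors
    """
    shrink = (1.0 + delta * (n - 1)) / n
    return e_bayes + shrink * e_added


def estimate_bayes_error(e_single, e_combined, n, delta):
    """Solve the assumed model for the Bayes error.

    e_single:   average total error of the individual classifiers
    e_combined: total error of the averaging combiner
    """
    if n < 2:
        raise ValueError("need at least two classifiers")
    if delta >= 1.0:
        raise ValueError("fully correlated members carry no extra information")
    # e_single   = e_bayes + e_added
    # e_combined = e_bayes + (1 + delta*(n-1))/n * e_added
    # Eliminating e_added gives the expression below.
    numerator = n * e_combined - (1.0 + delta * (n - 1)) * e_single
    denominator = (n - 1) * (1.0 - delta)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical numbers: Bayes error 0.10, added error 0.05,
    # five classifiers with correlation 0.2.
    e_avg = averaged_error(0.10, 0.05, n=5, delta=0.2)
    print(estimate_bayes_error(0.15, e_avg, n=5, delta=0.2))
```

In this sketch the estimate becomes unstable as delta approaches 1, since highly correlated classifiers give the combiner little to average out. That is consistent with the abstract's second method, which adds a measure of output correlation to the estimate.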

Details

  • Published online: 29 Oct 2010

Author affiliations

  • a NASA Ames Research Center, Moffett Field, California, USA
  • b Department of Electrical and Computer Engineering, University of Texas, Austin, Texas, USA
