LINEAR AND ORDER STATISTICS COMBINERS FOR PATTERN CLASSIFICATION
Kagan Tumer Joydeep Ghosh
Several researchers have experimentally shown that substantial
improvements can be obtained in difficult pattern recognition
problems by combining or integrating the outputs of multiple
classifiers.
This chapter provides an analytical framework to {\em quantify} the
improvements in classification results due to combining. The
results apply to both linear combiners and order statistics combiners.
We first show that, to a first-order approximation,
the error rate obtained over and above the Bayes error rate is
directly proportional to the variance of the actual decision boundaries
around the Bayes optimum boundary.
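
The proportionality can be illustrated with a small numerical sketch. This is not the chapter's derivation: it assumes a hypothetical one-dimensional problem with two equiprobable classes whose densities are unit-variance Gaussians centred at -1 and +1, so the Bayes boundary sits at 0. The function names are ours. For a small boundary offset b, the error above the Bayes rate in this toy problem is close to phi(1) * b^2 / 2, where phi is the standard normal density. Averaged over zero-mean offsets, it therefore scales with the offset variance.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def error_rate(boundary):
    """Error of the rule 'decide class B when x > boundary'.

    Assumed toy problem: class A ~ N(-1, 1), class B ~ N(+1, 1), equal priors.
    """
    miss_a = 1.0 - normal_cdf(boundary + 1.0)  # class A samples above the boundary
    miss_b = normal_cdf(boundary - 1.0)        # class B samples below the boundary
    return 0.5 * (miss_a + miss_b)

BAYES_ERROR = error_rate(0.0)  # the optimum boundary is at 0 by symmetry

def added_error(offset):
    """Error above the Bayes rate when the boundary is shifted by `offset`."""
    return error_rate(offset) - BAYES_ERROR

def quadratic_approx(offset):
    """First-order model for this toy problem: added error ~ phi(1) * offset^2 / 2."""
    return 0.5 * normal_pdf(1.0) * offset ** 2

for b in (0.05, 0.1, 0.2):
    print(b, added_error(b), quadratic_approx(b))
```

Doubling the offset roughly quadruples the added error in this setting, which is the behaviour the first-order result describes.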
Combining classifiers in output space
reduces this variance, and hence reduces the "added" error.
If $N$ unbiased classifiers are combined by simple averaging,
the added error rate can be reduced by a factor of $N$ if the
individual errors in approximating the decision boundaries are uncorrelated.
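
A minimal Monte Carlo sketch of the factor-of-$N$ reduction follows. It assumes each classifier's boundary offset is an independent zero-mean Gaussian and that the averaged boundary offset is the mean of the individual offsets. Both are simplifications chosen for illustration, and `variance_reduction` is a hypothetical helper name.

```python
import random
import statistics

def variance_reduction(n_classifiers, n_trials=20000, sigma=1.0, seed=0):
    """Estimate Var(single offset) / Var(averaged offset) by simulation."""
    rng = random.Random(seed)
    single, averaged = [], []
    for _ in range(n_trials):
        # Independent, unbiased (zero-mean) boundary offsets, one per classifier.
        offsets = [rng.gauss(0.0, sigma) for _ in range(n_classifiers)]
        single.append(offsets[0])
        averaged.append(sum(offsets) / n_classifiers)
    return statistics.pvariance(single) / statistics.pvariance(averaged)

ratio = variance_reduction(10)
print(ratio)  # expected to be close to the number of classifiers
```

Because the added error is proportional to the boundary variance, a variance ratio near $N$ corresponds to an added error reduced by about the same factor.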
Expressions are then derived for linear combiners which are biased
or correlated, and the effect of output correlations on
ensemble performance is quantified.
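
To see why correlation limits the gain, the sketch below uses the textbook identity for the variance of the mean of $N$ equal-variance variables with common pairwise correlation rho: the variance is multiplied by (1 + (N - 1) * rho) / N. This is offered as an illustration only. The chapter's own expressions, which also account for bias, may take a different form. The helper names and the shared-component construction are assumptions.

```python
import math
import random
import statistics

def averaging_variance_factor(n_classifiers, rho):
    """Factor by which averaging scales the variance of equally correlated offsets."""
    return (1.0 + (n_classifiers - 1) * rho) / n_classifiers

def simulated_factor(n_classifiers, rho, n_trials=20000, seed=0):
    """Monte Carlo check of the factor above, valid for 0 <= rho <= 1."""
    rng = random.Random(seed)
    averaged = []
    for _ in range(n_trials):
        shared = rng.gauss(0.0, 1.0)  # component common to all classifiers
        offsets = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_classifiers)
        ]
        averaged.append(sum(offsets) / n_classifiers)
    # Each offset has unit variance, so this variance is the factor itself.
    return statistics.pvariance(averaged)

print(averaging_variance_factor(10, 0.0))  # uncorrelated case: 1/N
print(averaging_variance_factor(10, 1.0))  # fully correlated case: no reduction
print(simulated_factor(4, 0.5), averaging_variance_factor(4, 0.5))
```

With rho = 0 the factor reduces to 1/N, and with rho = 1 averaging brings no reduction at all.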
For nonlinear combiners based on order statistics,
we derive expressions that indicate how much
the median, the maximum, and, in general, the $i$th order statistic
can improve classifier performance.
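
As a rough illustration for order statistics, the sketch below estimates by simulation the variance of the $i$th order statistic of $N$ independent standard normal errors. The Gaussian error model and the helper name `order_statistic_variance` are assumptions made for the example, not the chapter's setup. Variance is only part of the picture: statistics such as the maximum also shift the mean, which this sketch does not measure.

```python
import random
import statistics

def order_statistic_variance(n_classifiers, i, n_trials=20000, seed=0):
    """Estimate the variance of the i-th smallest of n i.i.d. N(0, 1) errors.

    `i` is 1-indexed: i = 1 is the minimum, i = n_classifiers the maximum.
    """
    if not 1 <= i <= n_classifiers:
        raise ValueError("i must lie between 1 and n_classifiers")
    rng = random.Random(seed)
    picks = []
    for _ in range(n_trials):
        errors = sorted(rng.gauss(0.0, 1.0) for _ in range(n_classifiers))
        picks.append(errors[i - 1])
    return statistics.pvariance(picks)

n = 5
median_var = order_statistic_variance(n, (n + 1) // 2)
max_var = order_statistic_variance(n, n)
print(median_var, max_var)  # both below the single-classifier variance of 1
```

Under these assumptions the median reduces the variance more than the maximum does, while both improve on a single classifier.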
The analysis presented here facilitates the
understanding of the relationships among error rates,
classifier boundary distributions, and combining in output space.
Experimental results on several public domain data sets
are provided to illustrate the benefits of combining and to support
the analytical results.