Abstract

We explore the use of the supra-classifier framework in the construction of a classifier knowledge base. We previously introduced this framework, in which labels produced by old classifiers are used to improve the generalization performance of a new classifier for a different but related classification task. We showed empirically that a simple Hamming nearest neighbor classifier outperforms other techniques (e.g., MLP, decision trees, Naive Bayes, combiners) as a supra-classifier. Here, we show theoretically that the probability that the Hamming nearest neighbor supra-classifier predicts the true target class approaches certainty at an exponential rate as more classifiers are reused. Because the Hamming nearest neighbor scales well to large numbers of previously created classifiers, it is a good choice of supra-classifier for building a repository of domain knowledge organized as a classifier knowledge base.
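
To make the idea concrete, the following is a minimal sketch of a Hamming nearest neighbor supra-classifier, not the authors' implementation. It assumes each example is represented by the tuple of labels that the old classifiers assign to it, and that distance is the number of positions where two such tuples disagree. The function names (`hamming_distance`, `hamming_nn_predict`), the majority vote among equally near neighbors, and the toy data are all hypothetical choices made for illustration.

```python
from collections import Counter


def hamming_distance(a, b):
    """Count the positions at which two label vectors disagree."""
    if len(a) != len(b):
        raise ValueError("label vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


def hamming_nn_predict(train_label_vectors, train_targets, query_vector):
    """Predict the target class of the nearest training example(s).

    Each label vector holds the outputs of the old classifiers for one
    example. Ties among equally near neighbors are resolved by majority
    vote (an assumption of this sketch, not taken from the paper).
    """
    distances = [hamming_distance(v, query_vector) for v in train_label_vectors]
    best = min(distances)
    votes = Counter(t for d, t in zip(distances, train_targets) if d == best)
    return votes.most_common(1)[0][0]


# Toy data: three old classifiers, each emitting an integer label.
train_label_vectors = [(0, 1, 2), (0, 1, 1), (1, 0, 2), (1, 0, 0)]
train_targets = ["A", "A", "B", "B"]

print(hamming_nn_predict(train_label_vectors, train_targets, (0, 1, 0)))
```

In this sketch, reusing one more old classifier simply appends one more position to every label vector, so the prediction code itself does not change as the knowledge base grows.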