O'Brien, Deirdre and Gupta, Maya and Gray, Robert
For two-class classification, it is common to classify by setting a threshold on class probability estimates, where the threshold is determined by ROC curve analysis. An analog for multi-class classification is learning a new class partitioning of the multi-class probability simplex to minimize empirical misclassification costs. We analyze the interplay between systematic errors in the class probability estimates and cost matrices for multi-class classification. We explore the effect on the class partitioning of five different transformations of the cost matrix. Experiments on benchmark datasets with naive Bayes and quadratic discriminant analysis show the effectiveness of learning a new partition matrix compared to previously proposed methods.
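To make the link between thresholds and simplex partitions concrete, the following is a minimal sketch of the standard minimum-expected-cost decision rule, which partitions the probability simplex according to a fixed cost matrix. It is not the partition-learning method of the paper, which fits the partition to training data. The function name `min_expected_cost_class`, the cost convention (`cost[i, j]` is the cost of predicting class `j` when the true class is `i`), and all numeric values are assumptions chosen for illustration.

```python
import numpy as np

def min_expected_cost_class(probs, cost):
    """Return, for each row of class probabilities, the class with lowest expected cost.

    probs: (n_samples, n_classes) estimated class probabilities.
    cost:  (n_classes, n_classes); cost[i, j] is the assumed cost of
           predicting class j when the true class is i.
    """
    probs = np.asarray(probs, dtype=float)
    cost = np.asarray(cost, dtype=float)
    # expected[n, j] = sum_i probs[n, i] * cost[i, j]
    expected = probs @ cost
    return expected.argmin(axis=1)

# Two-class case: the rule reduces to a threshold on the class-1 probability.
# With these illustrative costs, class 1 is predicted when 4 * p1 > p0, i.e. p1 > 0.2.
cost2 = np.array([[0.0, 1.0],
                  [4.0, 0.0]])
probs2 = np.array([[0.9, 0.1],
                   [0.7, 0.3],
                   [0.4, 0.6]])
labels2 = min_expected_cost_class(probs2, cost2)

# Three-class case with 0-1 costs: the rule reduces to picking the most probable class.
cost3 = 1.0 - np.eye(3)
probs3 = np.array([[0.2, 0.5, 0.3]])
labels3 = min_expected_cost_class(probs3, cost3)
```

When the probability estimates are systematically biased, the decision regions induced by this fixed rule are no longer cost-optimal, which is the situation in which learning a new partition from data is motivated.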
Discussion