
## Comparing classifiers

Finding datasets for comparing classifiers has become a problem recently. The UCI repository helps to a certain extent. But what if each paper uses its own experimentation strategy, or the authors do not provide the code needed for a fair comparison of the classifiers? Two points follow:

1) If the authors share their code, most of the problem goes away, although parameter setting can still be a headache.

2) Otherwise, the best thing to do is to fix the experimentation strategy used to evaluate the performance of a classifier. Salzberg's (1997) [a] proposal is a good one in that sense:

- First divide the data set into k subsets for cross validation.

- We then run a cross-validation as follows.

(A) For each of the k subsets of the data set D, create a training set T = D - k, that is, D with that subset removed.
(B) Divide each training set into two smaller subsets, T1 and T2. T1 will be used for training, and T2 for tuning. The parameters of the algorithm are tuned based on the error rates on T2. This way, the experimenter is forced to be more explicit about what those parameters are.
(C) Once the parameters are optimized, re-run training on the larger set T and measure the accuracy on the held-out subset k.
(D) Overall accuracy is averaged across all k partitions. The variance can also be estimated from the error rates of the k partitions.
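Steps (A) to (D) can be sketched roughly as follows. This is only an illustration in standard-library Python, not Salzberg's code or the R code this post refers to. The names `nested_cv`, `fit`, `error_rate` and `tune_fraction` are hypothetical, the 30% share of T held out as T2 is an arbitrary assumption, and a toy k-nearest-neighbour classifier on synthetic two-class data stands in for a real learner.

```python
import random
from statistics import mean, variance


def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nested_cv(data, k, param_grid, fit, error_rate, tune_fraction=0.3, seed=0):
    """Tune on T1/T2, refit on T, test on the held-out fold.

    fit(rows, param) -> model and error_rate(model, rows) -> float are
    hypothetical callables supplied by the caller.
    """
    rng = random.Random(seed)
    folds = k_fold_indices(len(data), k, rng)
    fold_errors, chosen = [], []
    for i, test_idx in enumerate(folds):
        # (A) T is every fold except the i-th one.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        # (B) Split T into T1 (training) and T2 (tuning); pick the
        #     parameter with the lowest error rate on T2.
        rng.shuffle(train_idx)
        n_tune = max(1, int(len(train_idx) * tune_fraction))
        t2 = [data[j] for j in train_idx[:n_tune]]
        t1 = [data[j] for j in train_idx[n_tune:]]
        best = min(param_grid, key=lambda p: error_rate(fit(t1, p), t2))
        # (C) Refit on all of T with the chosen parameter; test on fold i.
        model = fit([data[j] for j in train_idx], best)
        fold_errors.append(error_rate(model, [data[j] for j in test_idx]))
        chosen.append(best)
    # (D) Average the k error rates and estimate their variance.
    return mean(fold_errors), variance(fold_errors), chosen


# Toy learner (an assumption for illustration): k-nearest neighbours in 2-D.
def fit_knn(rows, n_neighbors):
    return rows, n_neighbors


def knn_error(model, rows):
    train, n_neighbors = model
    wrong = 0
    for x, y in rows:
        nearest = sorted(
            train,
            key=lambda r: (r[0][0] - x[0]) ** 2 + (r[0][1] - x[1]) ** 2,
        )[:n_neighbors]
        votes = [label for _, label in nearest]
        wrong += max(set(votes), key=votes.count) != y
    return wrong / len(rows)


# Synthetic data: two Gaussian blobs, 30 points per class.
data_rng = random.Random(42)
data = [
    ((data_rng.gauss(c, 1.0), data_rng.gauss(c, 1.0)), label)
    for label, c in ((0, 0.0), (1, 3.0))
    for _ in range(30)
]

mean_err, var_err, chosen = nested_cv(
    data, k=5, param_grid=[1, 3, 5], fit=fit_knn, error_rate=knn_error
)
print(mean_err, var_err, chosen)
```

Because the tuning in step (B) only ever sees T, the held-out fold plays no part in choosing the parameter, which is the point of the procedure. The list `chosen` records which parameter value won in each fold, so the tuning itself becomes part of what is reported.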

The paper does not state this, but more reliable estimates can be obtained by replicating the cross-validation n times.

R code that illustrates the experimentation strategy is provided here. It covers a basic scenario on the Iris dataset. Suppose we would like to train a random forest and want to find the best setting for the number of features tried at each split (the mtry parameter of the randomForest function). The number of trees is fixed at 50. Iris has 4 features, and the factors 0.2, 0.4 and 0.6 are tried (which gives 1, 2 and 3 features respectively).
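The arithmetic behind the factors, and the idea of replicating the cross-validation, can be sketched as below. This is a standard-library Python illustration, not the R code mentioned above. Rounding the fractions up is an assumption (it is what reproduces 1, 2 and 3; rounding to nearest would give 1, 2 and 2), and `repeated_partitions`, the 10 folds and the 5 replications are hypothetical choices.

```python
import math
import random

n_features = 4                # Iris has 4 features
factors = [0.2, 0.4, 0.6]

# Assumption: each fraction of the feature count is rounded up.
mtry_grid = [math.ceil(f * n_features) for f in factors]
print(mtry_grid)              # [1, 2, 3]


def repeated_partitions(n_samples, k, n_repeats, base_seed=0):
    """Yield one fresh k-fold partition per replication.

    Each replication reshuffles the sample indices with its own seed,
    so the n_repeats cross-validations see different splits.
    """
    for r in range(n_repeats):
        rng = random.Random(base_seed + r)
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield [indices[i::k] for i in range(k)]


# Iris has 150 rows; 10 folds and 5 replications are arbitrary here.
for r, folds in enumerate(repeated_partitions(150, k=10, n_repeats=5)):
    print(r, [len(fold) for fold in folds])
```

Each replication would then run the full tuning and testing procedure on its own partition, and the resulting error rates are pooled across replications.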

[a] Salzberg, S. L. (1997). On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach. Data Mining and Knowledge Discovery, 1(3).