Cross Validation
In this section we discuss how to evaluate a classifier $M$ using some performance measure $\theta$. Typically, the input dataset $D$ is randomly split into a disjoint training set and testing set. The training set is used to learn the model $M$, and the testing set is used to evaluate the measure $\theta$. However, how confident can we be about the classification performance? The results may be an artifact of the random split; for example, by random chance the testing set may contain points that are particularly easy (or hard) to classify, leading to good (or poor) classifier performance. As such, a fixed, predefined partitioning of the dataset is not a good strategy for evaluating classifiers. Also note that, in general, $D$ is itself a $d$-dimensional multivariate random sample drawn from the true (unknown) joint probability density function $f(x)$ that represents the population of interest. Ideally, we would like to know the expected value $E[\theta]$ of the performance measure over all possible testing sets drawn from $f$.
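To make the discussion concrete, the sketch below estimates $E[\theta]$ (taking $\theta$ to be accuracy) by averaging over $K$ disjoint test folds rather than relying on a single random split. The nearest-centroid classifier and the synthetic two-class data are illustrative assumptions, not part of the text.

```python
import numpy as np

def kfold_accuracy(X, y, K=5, seed=0):
    """Estimate the expected accuracy E[theta] by averaging over K disjoint test folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))           # shuffle to avoid artifacts of data ordering
    folds = np.array_split(idx, K)          # K roughly equal, disjoint folds
    scores = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        # "Train": compute one centroid per class from the training fold.
        classes = np.unique(y[train])
        centroids = np.array([X[train][y[train] == c].mean(axis=0) for c in classes])
        # "Test": assign each held-out point to the class of its nearest centroid.
        dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
        y_pred = classes[np.argmin(dists, axis=1)]
        scores.append(np.mean(y_pred == y[test]))
    # Mean over folds estimates E[theta]; the spread hints at the split-to-split variation.
    return np.mean(scores), np.std(scores)

# Example: two well-separated Gaussian classes in d = 2 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(3, 1, (100, 2))])
y = np.repeat([0, 1], 100)
mean_acc, std_acc = kfold_accuracy(X, y, K=5)
print(f"estimated E[theta] = {mean_acc:.3f} (+/- {std_acc:.3f})")
```

Averaging over several disjoint test folds, rather than scoring one fixed split, directly addresses the concern above: a single lucky (or unlucky) partition no longer determines the reported performance.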