Step 3: Performing algorithmic selection

Besides selecting the right hyperparameters, we may want to extend our techniques to select the right algorithm. For example, we may want to choose between a logistic regression model and a decision tree. Here we describe in detail how nested k-fold cross-validation can help us select the best-performing algorithm.

Nested k-fold cross-validation

Nested cross-validation is a widely used technique for algorithmic selection. We recommend this method because it has been shown to have low bias and works well even on smaller datasets (Iizuka et al., 2003; Varma and Simon, 2006).

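The method is relatively simple. It nests two k-fold cross-validation loops: the inner loop tunes the hyperparameters, and the outer loop estimates the generalization accuracy. Because tuning only ever sees the training portion of each outer fold, the outer test fold is never used to choose hyperparameters, which keeps the accuracy estimate from being optimistically biased.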
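The two loops can be sketched in a few lines with scikit-learn. This is a minimal illustration rather than the example from our GitHub page: the dataset, the hyperparameter grids, the number of folds, and the accuracy metric are all assumptions chosen to keep the sketch short. `GridSearchCV` plays the role of the inner loop, and `cross_val_score` plays the role of the outer loop.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumed example dataset (binary classification, 569 samples).
X, y = load_breast_cancer(return_X_y=True)

# Inner loop tunes hyperparameters; outer loop estimates generalization accuracy.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

# Candidate algorithms with illustrative (assumed) hyperparameter grids.
candidates = {
    "logistic_regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4, 8, None]},
    ),
}

nested_scores = {}
for name, (estimator, grid) in candidates.items():
    # The grid search is re-run from scratch inside every outer training fold.
    search = GridSearchCV(estimator, grid, cv=inner_cv, scoring="accuracy")
    scores = cross_val_score(search, X, y, cv=outer_cv, scoring="accuracy")
    nested_scores[name] = scores
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")

# Select the algorithm with the highest mean outer-fold accuracy.
best_algorithm = max(nested_scores, key=lambda name: nested_scores[name].mean())
print("Selected algorithm:", best_algorithm)
```

The outer-fold scores compare the whole procedure (algorithm plus its tuning), not one fixed set of hyperparameters. Once an algorithm has been selected, a common follow-up is to run the inner search once more on the full dataset to obtain the final model.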

An alternative is to use bootstrapping in the inner loop instead of cross-validation.
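One way to sketch the bootstrap variant is to hand scikit-learn a custom splitter for the inner loop. The `BootstrapSplit` class below is hypothetical (it is not part of scikit-learn): it trains on a sample drawn with replacement and validates on the out-of-bag rows. The dataset, grid, and number of bootstrap rounds are again assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier


class BootstrapSplit:
    """Hypothetical splitter: bootstrap sample for training, out-of-bag rows for validation."""

    def __init__(self, n_splits=10, random_state=0):
        self.n_splits = n_splits
        self.random_state = random_state

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits

    def split(self, X, y=None, groups=None):
        rng = np.random.default_rng(self.random_state)
        n_samples = len(X)
        all_rows = np.arange(n_samples)
        for _ in range(self.n_splits):
            # Draw n_samples row indices with replacement.
            train = rng.integers(0, n_samples, size=n_samples)
            # Rows never drawn form the out-of-bag validation set.
            test = np.setdiff1d(all_rows, train)
            yield train, test


X, y = load_breast_cancer(return_X_y=True)

# Inner loop: bootstrap resampling. Outer loop: ordinary stratified k-fold.
inner_bootstrap = BootstrapSplit(n_splits=10, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    {"max_depth": [2, 4, 8, None]},  # assumed grid
    cv=inner_bootstrap,
    scoring="accuracy",
)
scores = cross_val_score(search, X, y, cv=outer_cv, scoring="accuracy")
print(f"decision_tree (bootstrap inner loop): {scores.mean():.3f} +/- {scores.std():.3f}")
```

Only the inner resampling scheme changes here; the outer loop still holds out each test fold from tuning, so the scores can be compared across algorithms in the same way as before.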

For an example of algorithm selection using cross-validation, you can access our GitHub page here or download the following file:
