Step 2: Hyperparameter Search
Once the validation method has been decided upon (as per Step 1) and the performance measure (for example, model accuracy) has been chosen, the search for the hyperparameter settings (the ‘best hyperparameter settings’) that yield the ‘best’ model performance can begin. Although a model owner could find such settings through a manual process of trial and error, in this step we discuss three methods for hyperparameter search that benefit from automation: i) random search; ii) grid search; and iii) automated hyperparameter tuning.
Random Search
In the random search process, the model owner provides a (possibly many-dimensional) grid of hyperparameters, that is, a set of possible settings for each hyperparameter, which defines the universe of possible hyperparameter combinations that could be tested for model performance. The process randomly samples hyperparameter settings from this universe and validates model performance for each sample. The method can be summarised as follows:
The model owner provides a grid of hyperparameters, a validation method (including a setting for k, if needed) and a number of iterations (‘n_iter’).
The steps for the given validation method are then followed, with the sets of hyperparameter settings used in step 2 (of either the three-way holdout method or the k-fold cross-validation method) provided by n_iter complete hyperparameter settings drawn at random from the grid of hyperparameters.
Random search can be performed in Python, using the holdout method, with the Scikit-learn function ParameterSampler(). Random search, with k-fold cross-validation or LOOCV embedded, can be effected using Scikit-learn’s RandomizedSearchCV() function.
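As an illustration, the sketch below shows random search with 5-fold cross-validation embedded, using RandomizedSearchCV(); the dataset, model and grid values are illustrative assumptions rather than recommendations. ParameterSampler() can be used in a similar fashion where the holdout method is preferred, with the model owner writing the validation loop explicitly.

```python
# A minimal sketch of random search with k-fold cross-validation embedded,
# using Scikit-learn's RandomizedSearchCV(). The dataset, model and grid
# values below are illustrative choices, not prescriptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The grid of hyperparameters: a set of possible settings for each hyperparameter.
param_grid = {
    "n_estimators": [50, 100, 200, 400],
    "max_depth": [2, 4, 8, None],
    "min_samples_split": [2, 5, 10],
}

# n_iter random samples of complete hyperparameter settings are drawn from the
# grid, and each is validated with 5-fold cross-validation on the training data.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_grid,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

print("Best hyperparameter settings:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
print("Test-set accuracy:", search.score(X_test, y_test))
```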
Grid Search
The grid search process is similar to the random search process. However, it differs in that all possible combinations of hyperparameters that can be generated from the owner-provided grid of hyperparameters are tested for model performance. Thus, this method can be summarised as follows:
The model owner provides a grid of hyperparameters and a validation method (including a setting for k, if needed).
The steps for the given validation method are then followed, with the sets of hyperparameter settings used in step 2 (of either the three-way holdout method or the k-fold cross-validation method) provided by the entire set of possible hyperparameter combinations that can be generated from the grid of hyperparameters.
It should be noted that, where the grid comprises many hyperparameters and many possible settings for each, the number of possible combinations (the product of the number of settings for each hyperparameter) can become very large indeed.
Grid search can be performed in Python, using the holdout method, with the Scikit-learn function ParameterGrid(). Grid search, with k-fold cross-validation or LOOCV embedded, can be effected using Scikit-learn’s GridSearchCV() function.
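As an illustration, the sketch below uses ParameterGrid() to count the possible combinations and GridSearchCV() to validate each one with 5-fold cross-validation; the dataset, model and grid values are again illustrative assumptions only.

```python
# A minimal sketch of grid search with k-fold cross-validation embedded, using
# Scikit-learn's ParameterGrid() to count combinations and GridSearchCV() to
# evaluate them. The dataset, model and grid values are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, ParameterGrid, train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The grid of hyperparameters: every combination will be validated.
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": ["scale", 0.01, 0.001],
    "kernel": ["rbf", "linear"],
}

# The number of combinations is the product of the settings per hyperparameter:
# 4 x 3 x 2 = 24 here, and it grows quickly as the grid grows.
print("Number of combinations:", len(ParameterGrid(param_grid)))

# Each combination is validated with 5-fold cross-validation.
search = GridSearchCV(SVC(), param_grid=param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

print("Best hyperparameter settings:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
print("Test-set accuracy:", search.score(X_test, y_test))
```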
Automated Hyperparameter Tuning
Automated hyperparameter tuning methods seek to bring greater efficiency to the hyperparameter optimisation process than the random search and grid search methods presented thus far. We consider two such methods, i) Bayesian optimisation and ii) evolutionary algorithms, both of which direct their search for the ‘best hyperparameters’ using the model performance results seen so far.
Bayesian optimisation uses Bayes’ theorem to direct the search towards the minimum or maximum of an objective function; for example, we could maximise model (generalisation) accuracy either directly or by seeking the minimum of the negative accuracy (see Wu et al., 2019 for a discussion of the use of Bayesian optimisation for hyperparameter optimisation).
Bayesian optimisation can be performed in Python using the Hyperopt package.
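As an illustration, the sketch below uses Hyperopt’s fmin() function with the TPE algorithm to minimise the negative cross-validated accuracy of a model; the search space, model and number of evaluations are illustrative assumptions only.

```python
# A minimal sketch of Bayesian-style optimisation with the Hyperopt package.
# The objective returns the negative cross-validated accuracy, so minimising it
# maximises accuracy. The search space and model choice are illustrative only.
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Search space: Hyperopt samples from these choices rather than testing a full grid.
space = {
    "n_estimators": hp.choice("n_estimators", [50, 100, 200, 400]),
    "max_depth": hp.choice("max_depth", [2, 4, 8, None]),
}

def objective(params):
    model = RandomForestClassifier(random_state=0, **params)
    accuracy = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    # Hyperopt minimises the loss, so return the negative accuracy.
    return {"loss": -accuracy, "status": STATUS_OK}

trials = Trials()
# max_evals caps the number of objective evaluations (iterations of step 2).
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=25, trials=trials)
print("Best settings (as indices into the hp.choice lists):", best)
```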
Evolutionary algorithms work over a population of hyperparameter settings, using a natural selection process to isolate the settings that generate the best model performance. The best-performing settings are then combined to generate offspring, and the process is repeated. Tani et al., 2021 provide a good description of hyperparameter optimisation using evolutionary algorithms.
TPOT is a suitable package for performing hyperparameter search using evolutionary algorithms, and works with the machine learning models of the Scikit-learn library; a brief sketch follows below.
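The sketch below runs a small evolutionary search with TPOTClassifier; the generation and population sizes are illustrative assumptions, kept deliberately small to limit runtime, and the parameter names follow the classic TPOT interface (newer releases may differ).

```python
# A minimal sketch of evolutionary hyperparameter (and pipeline) search with the
# TPOT package, which evolves Scikit-learn pipelines. The settings below are
# illustrative and assume the classic TPOT interface.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each generation, the population of candidate settings is validated with
# 5-fold cross-validation; the best performers are combined to produce offspring.
tpot = TPOTClassifier(
    generations=5,
    population_size=20,
    cv=5,
    scoring="accuracy",
    random_state=0,
    verbosity=2,
)
tpot.fit(X_train, y_train)

print("Test-set accuracy:", tpot.score(X_test, y_test))
tpot.export("best_pipeline.py")  # writes Python code for the best pipeline found
```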
It should be noted that the evolutionary algorithm technique can become computationally very expensive when operating over a large initial population (that is, a large set of candidate hyperparameter settings), as model validation must take place for every population member, then for their offspring, and so on. It can be seen that the Bayesian optimisation method can be inserted into step 2 of the validation methodologies discussed in much the same way as random search; a maximum number of evaluations can be provided for the method in order to define the number of iterations of step 2.
For an example of hyperparameter optimisation, you can access our GitHub page.