Step 2: Hyperparameter Search
Once the validation method has been decided upon (as per Step 1) and the performance measure (for example, model accuracy) has been determined, the search for the hyperparameter settings (the ‘best’ hyperparameter settings) that yield the best model performance can begin. Although a model owner could set out to find such settings through a manual process of trial and error, this step discusses three methods for hyperparameter search that benefit from automation: i) random search; ii) grid search; and iii) automated hyperparameter tuning.
In the random search process, the model owner provides a (possibly many-dimensional) grid of hyperparameters, that is, a set of possible settings for each hyperparameter, which determines the universe of possible hyperparameter combinations that could be tested for model performance. The process then randomly samples hyperparameter settings from this universe and validates model performance for each sample. The method can be summarised as follows:
The model owner provides a grid of hyperparameters, a validation method (including a setting for k, if needed) and a number of iterations (‘n_iter’).
The steps for the given validation method are then carried out, with the hyperparameter settings used in step 2 (of either the three-way holdout method or the k-fold cross-validation method) supplied by n_iter complete hyperparameter settings sampled at random from the grid.
Random search can be performed in Python with Scikit-learn’s RandomizedSearchCV function, which embeds k-fold cross-validation or LOOCV via its cv argument; the holdout method can be reproduced by passing a single predefined train/validation split (for example, a ShuffleSplit with n_splits=1) as that argument.
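As a minimal sketch of the above (the estimator, grid, dataset and number of iterations are illustrative assumptions, not prescriptions):

```python
# Random search with 5-fold cross-validation embedded, via RandomizedSearchCV.
from scipy.stats import randint
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
# Hold back a test set; the search validates only on folds of the training set.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_distributions = {
    "n_estimators": randint(50, 500),   # sampled uniformly at random
    "max_depth": [None, 5, 10, 20],     # sampled from a discrete list
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=25,            # number of hyperparameter settings sampled (n_iter)
    cv=5,                 # k-fold cross-validation with k = 5
    scoring="accuracy",   # the chosen performance measure
    random_state=0,
)
search.fit(X_train, y_train)

print("Best hyperparameter settings:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```

Passing, say, ShuffleSplit(n_splits=1, test_size=0.25) as the cv argument instead would reproduce a single holdout validation split.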
The grid search process is similar to the random search process. However, it differs in that all possible combinations of hyperparameters that can be generated from the owner-provided grid of hyperparameters are tested for model performance. Thus, this method can be summarised as follows:
The model owner provides a grid of hyperparameters and a validation method (including a setting for k, if needed).
The steps for the given validation method are then carried out, with the hyperparameter settings used in step 2 (of either the three-way holdout method or the k-fold cross-validation method) supplied by the entire set of hyperparameter combinations that can be generated from the grid.
It should be noted that, where the hyperparameter grid comprises a large number of hyperparameters and a large number of possible settings for each hyperparameter, the number of possible combinations, being the product of the number of settings for each hyperparameter, can become very large indeed.
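As a rough illustration (the grid below is hypothetical), Scikit-learn’s ParameterGrid can be used to count the combinations a grid implies:

```python
# Four hyperparameters with 10, 8, 5 and 4 candidate settings already give
# 10 * 8 * 5 * 4 = 1,600 combinations, each of which grid search must validate.
from sklearn.model_selection import ParameterGrid

grid = {
    "n_estimators": list(range(50, 550, 50)),        # 10 settings
    "max_depth": [None, 3, 5, 8, 10, 15, 20, 30],    # 8 settings
    "min_samples_leaf": [1, 2, 4, 8, 16],            # 5 settings
    "max_features": ["sqrt", "log2", 0.5, None],     # 4 settings
}
print(len(ParameterGrid(grid)))  # 1600
```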
Grid search can be performed in Python with Scikit-learn’s GridSearchCV function, which embeds k-fold cross-validation or LOOCV via its cv argument; as with random search, the holdout method can be reproduced by passing a single predefined train/validation split as that argument.
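A minimal sketch, mirroring the random search example above (the estimator, grid and dataset are again illustrative):

```python
# Grid search with 5-fold cross-validation embedded, via GridSearchCV.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_grid = {
    "n_estimators": [100, 200, 400],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 4],
}  # 3 * 3 * 2 = 18 combinations, each validated on every fold

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

print("Best hyperparameter settings:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```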
Automated hyperparameter tuning methods seek to bring greater efficiency to the hyperparameter optimisation process than the random search and grid search methods presented thus far. We consider two such methods, i) Bayesian optimisation and ii) evolutionary algorithms, both of which direct their search for the ‘best’ hyperparameters by taking account of the model performance results seen so far.
Bayesian optimisation uses Bayes’ theorem to direct the search for the minimum or maximum of an objective function; for example, model (generalisation) accuracy can be maximised either directly or by seeking the minimum of the negative accuracy.
The Bayesian optimisation method can be inserted into step 2 of the discussed validation methodologies in much the same way as random search, and a maximum number of evaluations can be provided in order to define the number of iterations of step 2.
Bayesian optimisation can be performed in Python using packages such as hyperopt or scikit-optimize.
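As one possible sketch, using hyperopt’s tree-structured Parzen estimator (the estimator, search space, dataset and evaluation budget below are illustrative assumptions), the objective to be minimised is the negative mean cross-validated accuracy, as described above:

```python
# Bayesian-style optimisation with hyperopt: the objective returns the
# quantity to be minimised (negative mean cross-validated accuracy).
from hyperopt import Trials, fmin, hp, tpe
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

def objective(params):
    model = RandomForestClassifier(
        n_estimators=int(params["n_estimators"]),
        max_depth=int(params["max_depth"]),
        random_state=0,
    )
    accuracy = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    return -accuracy  # minimising -accuracy maximises accuracy

space = {
    "n_estimators": hp.quniform("n_estimators", 50, 500, 50),
    "max_depth": hp.quniform("max_depth", 2, 20, 1),
}

best = fmin(
    fn=objective,
    space=space,
    algo=tpe.suggest,   # the model-based search strategy
    max_evals=25,       # maximum number of evaluations (iterations of step 2)
    trials=Trials(),
)
print("Best hyperparameter settings found:", best)
```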
Evolutionary algorithms work over a population of hyperparameter settings and use a natural-selection process to isolate the settings that generate the best model performance. The best-performing settings are then combined to generate offspring, and the process is repeated. It should be noted that this technique can become computationally very expensive when the initial population (that is, the population of candidate hyperparameter settings) is large, as model validation must take place for every population member, then for their offspring, and so on.
Packages such as TPOT, which is able to use the machine learning models of the Scikit-learn library, are suitable for performing hyperparameter search using evolutionary algorithms.
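A minimal sketch with TPOT (the dataset, population size and number of generations are illustrative; note that TPOT evolves whole Scikit-learn pipelines, of which the hyperparameter settings form a part):

```python
# Evolutionary search with TPOT: a population of candidate pipelines is
# validated, the fittest are selected and combined, and the cycle repeats.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tpot = TPOTClassifier(
    generations=5,        # rounds of selection and reproduction
    population_size=20,   # candidates validated in each generation
    cv=5,                 # each candidate is scored with 5-fold CV
    scoring="accuracy",
    random_state=0,
    verbosity=2,
)
tpot.fit(X_train, y_train)

print("Held-out accuracy:", tpot.score(X_test, y_test))
print(tpot.fitted_pipeline_)  # the best pipeline (and settings) found
```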
For an example of hyperparameter optimisation, you can access our GitHub page or download the following file: