Additional Considerations

Several aspects of hyperparameter optimisation are not covered in this roadmap but nonetheless merit attention:

Setting of hyperparameter grids: The random search and grid search methods outlined in Step 2 require the model owner to declare, at the outset, the hyperparameter grids over which the search is conducted. This declaration determines both the bounds and the density of the search. Poorly chosen bounds can cause the search to omit entirely those regions of the hyperparameter space that yield the models with the greatest expected generalisation performance (Hertel et al., 2020 ask whether the hyperparameter ranges are appropriate). A low search density may cause performance maxima to be missed, whilst a high search density may incur prohibitive computational costs. It is therefore worth consulting the literature for heuristics on setting search bounds and density.
Hyperparameter efficacy: Not all hyperparameters affect model performance equally. Efficient hyperparameter optimisation should therefore concentrate search effort on the hyperparameters that yield the largest performance gains. Identifying which hyperparameters to focus on also informs the setting of hyperparameter grids for both the random search and grid search methods. See Hutter et al., 2014 for further information on hyperparameter importance.
Confidence bounds for model performance: This roadmap discusses neither the computation nor the usefulness of confidence bounds for the model performance figures generated by validation processes. When comparing hyperparameter sets, the uncertainty around the estimated generalisation performance provides additional information for judging whether one set of hyperparameter settings can confidently be expected to outperform another. For further reading, Raschka, 2020 provides a good presentation of the computation of confidence intervals for model performance.
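
To make the last point concrete, the sketch below computes a confidence interval for a validation accuracy figure using the normal approximation to the binomial distribution, which is one common approach and not necessarily the one best suited to a given validation scheme. The function names, the choice of accuracy as the metric, the 95% level (z = 1.96) and the validation counts are all illustrative assumptions and do not come from this roadmap.

```python
import math


def accuracy_confidence_interval(n_correct, n_total, z=1.96):
    """Normal-approximation confidence interval for classification accuracy.

    Assumes the n_total validation examples are independent and that
    n_total is large enough for the normal approximation to be reasonable.
    z=1.96 corresponds to an approximate 95% confidence level.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    acc = n_correct / n_total
    half_width = z * math.sqrt(acc * (1.0 - acc) / n_total)
    # Clip to [0, 1] because accuracy cannot fall outside that range.
    return max(0.0, acc - half_width), min(1.0, acc + half_width)


def intervals_overlap(a, b):
    """Return True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical validation results for two hyperparameter sets (made-up counts).
ci_a = accuracy_confidence_interval(n_correct=870, n_total=1000)
ci_b = accuracy_confidence_interval(n_correct=885, n_total=1000)

print("Set A interval:", ci_a)
print("Set B interval:", ci_b)
print("Intervals overlap:", intervals_overlap(ci_a, ci_b))
```

Overlapping intervals are only a rough, conservative signal that the observed difference between two hyperparameter sets may not be meaningful; they are not a formal hypothesis test. Under k-fold cross-validation the per-fold scores are not independent, so an interval of this form should be read with extra caution.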

