Maximising Model Performance Via Hyperparameter Selection
Machine learning models comprise two parameter types:
Model parameters - These are typically referred to simply as ‘parameters’; they are learned from the data during the model training process;
Hyperparameters - These are the settings that the model owner can set explicitly before training. They may concern the model architecture (e.g. the number of hidden layers in a feed-forward neural network) or the learning process (e.g. whether to use Adam optimisation, the strength of the ridge regression penalty term, the learning rate).
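To make the distinction concrete, here is a minimal sketch (pure Python, one-dimensional ridge regression on illustrative data): the coefficient `w` is a model parameter, learned from the data by the fitting procedure, while the penalty strength `alpha` is a hyperparameter, fixed by the model owner before fitting.

```python
def fit_ridge_1d(xs, ys, alpha):
    """Fit a 1-D ridge regression y = w * x (no intercept).

    The closed-form solution is w = sum(x*y) / (sum(x^2) + alpha).
    `alpha` is the hyperparameter; the returned `w` is the learned parameter.
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]

# Same data, different hyperparameter values, different learned parameters:
print(fit_ridge_1d(xs, ys, alpha=0.0))   # -> 2.0 (ordinary least squares)
print(fit_ridge_1d(xs, ys, alpha=14.0))  # -> 1.0 (heavily shrunk towards zero)
```

The point is only the division of labour: nothing in the training data determines `alpha`; it must be chosen by some process external to the fit itself.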
The performance of a trained machine learning model can be very sensitive to its hyperparameter settings, so the model owner should choose them carefully to maximise the model’s efficacy (‘hyperparameter optimisation’). In particular, hyperparameters can often be optimised at relatively low marginal compute cost compared with competing ways of increasing model performance, such as using larger training datasets, deeper models, or even a different choice of model.
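As a flavour of why the marginal cost can be low: each hyperparameter trial typically costs one model fit plus one evaluation on held-out data. The sketch below (pure Python, hypothetical data; the train/validation split and the candidate grid are illustrative assumptions, and validation is covered properly in Step 1) evaluates a handful of candidate ridge penalty strengths and keeps the one with the lowest held-out error.

```python
def fit_ridge_1d(xs, ys, alpha):
    # Closed-form 1-D ridge (no intercept): w = sum(x*y) / (sum(x^2) + alpha)
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def validation_error(w, xs, ys):
    # Sum of squared errors of the fitted model on held-out data
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))

# Hypothetical noisy data, split into a training set and a validation set
train_x, train_y = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
val_x, val_y = [4.0, 5.0], [8.1, 9.8]

# One fit + one evaluation per candidate: four cheap trials in total
candidates = [0.0, 0.1, 1.0, 10.0]
best_alpha = min(
    candidates,
    key=lambda a: validation_error(fit_ridge_1d(train_x, train_y, a), val_x, val_y),
)
print(best_alpha)
```

Compare this with the alternatives named above: gathering a larger training set or retraining a deeper model each cost far more than a few extra fits of an already-specified model.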
This guide will help you to improve model performance (that is, the model’s ‘efficacy’) via hyperparameter optimisation. To explain its implementation fully, the guide also discusses validation methodologies, which seek to ensure that models perform well on new, unseen data.
We present the roadmap in two steps:
Step 1: Validation
Step 2: Hyperparameter search
before concluding with topics for further consideration.