General Information

When using a machine learning algorithm, the data must be split so that the model is trained on one portion and evaluated on data it has not seen. The loss on that held-out data indicates how well the model generalizes.

The input data is split into three groups:

Training set : The data used to train the model, i.e., to search for the model's parameters.

Validation set : The data used to tune the hyperparameters of the model. For example, the validation set can be used to select the number of layers in a neural network.

Test set : The data used to assess the performance of the final model. It should not be used for training or for tuning hyperparameters.

Common ratios used for the split:

80% training set, 10% validation set, 10% test set

70% training set, 15% validation set, 15% test set
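
As an illustration of the ratios above, here is a minimal sketch of an 80/10/10 split using only the Python standard library. The function name `split_dataset` and its parameters are hypothetical and chosen for this example; they are not part of any particular tool. The sketch assumes the samples can be shuffled freely, which does not hold for time-ordered or grouped data.

```python
import random


def split_dataset(samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle samples and split them into train, validation, and test lists.

    Hypothetical helper for illustration; the test set receives whatever
    remains after the training and validation shares are taken.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")

    shuffled = list(samples)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Example: 100 samples split 80/10/10.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Passing `train_frac=0.7, val_frac=0.15` would give the second ratio listed above. Shuffling before slicing keeps each subset representative of the whole, and the three subsets never share a sample.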

Last updated on Jun 01, 2022
