Divide Data for Optimal Neural Network Training

This topic presents part of a typical multilayer network workflow. For more information and other steps, see Multilayer Shallow Neural Networks and Backpropagation Training.

When training multilayer networks, the general practice is to first divide the data into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases. The second subset is the validation set. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. The network weights and biases are saved at the minimum of the validation set error. This technique is discussed in more detail in Improve Shallow Neural Network Generalization and Avoid Overfitting.

The third subset is the test set. The test set error is not used during training, but it is used to compare different models. It is also useful to plot the test set error during the training process. If the test set error reaches its minimum at a significantly different iteration number than the validation set error, this might indicate a poor division of the data set.
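The check described above can be sketched as follows. This is a minimal illustration, assuming Deep Learning Toolbox is installed; the simplefit_dataset sample data and the hidden layer size of 10 are placeholders, not part of the workflow described here. It reads the training record returned by train and compares the epoch with the lowest validation error against the epoch with the lowest test error.

```matlab
% Sketch: compare where the validation and test errors reach their minima.
% Assumptions: Deep Learning Toolbox is available; simplefit_dataset and the
% hidden layer size of 10 are placeholders for your own data and network.
[x, t] = simplefit_dataset;
net = feedforwardnet(10);
net.trainParam.showWindow = false;   % suppress the training window

[net, tr] = train(net, x, t);        % tr is the training record

[~, iVal]  = min(tr.vperf);          % position of the lowest validation error
[~, iTest] = min(tr.tperf);          % position of the lowest test error
valEpoch  = tr.epoch(iVal);
testEpoch = tr.epoch(iTest);

fprintf('Validation minimum at epoch %d, test minimum at epoch %d\n', ...
        valEpoch, testEpoch);
```

Calling plotperform(tr) afterwards plots the training, validation, and test errors against the epoch number, which makes a large gap between the two minima easy to spot. Because the default division is random, the epochs reported vary from run to run.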

There are four functions provided for dividing data into training, validation and test sets. They are dividerand (the default), divideblock, divideint, and divideind. The data division is normally performed automatically when you train the network.

Function       Algorithm
dividerand     Divide the data randomly (default)
divideblock    Divide the data into contiguous blocks
divideint      Divide the data using an interleaved selection
divideind      Divide the data by index

You can access or change the division function for your network with this property:

net.divideFcn

Each of the division functions takes parameters that customize its behavior. These values are stored and can be changed with the following network property:

net.divideParam
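As an illustration of these two properties, the following sketch displays the current settings and then switches to a block division. It assumes Deep Learning Toolbox is installed; feedforwardnet(10) and the 0.6/0.2/0.2 ratios are arbitrary placeholders.

```matlab
% Sketch: inspect and change the data division settings of a network.
% Assumption: feedforwardnet(10) stands in for your own network.
net = feedforwardnet(10);

net.divideFcn            % no semicolon, so the current function name is shown
net.divideParam          % no semicolon, so the current parameters are shown

% Switch to contiguous blocks, then adjust the ratios.
net.divideFcn = 'divideblock';
net.divideParam.trainRatio = 0.6;
net.divideParam.valRatio   = 0.2;
net.divideParam.testRatio  = 0.2;
```

The sketch sets net.divideFcn before the parameters. This ordering is a precaution: changing the division function may reset net.divideParam to the defaults of the new function.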

The division function is called automatically whenever the network is trained, and it divides the data into training, validation, and testing subsets. If net.divideFcn is set to 'dividerand' (the default), then the data is randomly divided into the three subsets using the division parameters net.divideParam.trainRatio, net.divideParam.valRatio, and net.divideParam.testRatio. The fraction of data that is placed in the training set is trainRatio/(trainRatio+valRatio+testRatio), with a similar formula for the other two sets. The default ratios for training, validation, and testing are 0.7, 0.15, and 0.15, respectively.
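The ratio formula can be worked through directly. The sketch below computes the three fractions from the default ratios and then calls dividerand on its own to obtain index sets; the sample count Q = 100 is an arbitrary assumption for illustration.

```matlab
% Sketch: turn the division ratios into fractions and random index sets.
% Assumption: Q = 100 samples is an arbitrary illustrative value.
trainRatio = 0.7;
valRatio   = 0.15;
testRatio  = 0.15;

% Each fraction is that subset's ratio divided by the sum of all three,
% so the ratios themselves do not need to sum to 1.
total     = trainRatio + valRatio + testRatio;
trainFrac = trainRatio / total;
valFrac   = valRatio   / total;
testFrac  = testRatio  / total;

Q = 100;
[trainInd, valInd, testInd] = dividerand(Q, trainRatio, valRatio, testRatio);

fprintf('Fractions: %.2f train, %.2f validation, %.2f test\n', ...
        trainFrac, valFrac, testFrac);
fprintf('Counts:    %d train, %d validation, %d test\n', ...
        numel(trainInd), numel(valInd), numel(testInd));
```

Because the fractions are normalized, ratios of 70, 15, and 15 describe the same split as 0.7, 0.15, and 0.15.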

If net.divideFcn is set to 'divideblock', then the data is divided into three subsets using three contiguous blocks of the original data set (training taking the first block, validation the second and testing the third). The fraction of the original data that goes into each subset is determined by the same three division parameters used for dividerand.
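A small sketch of the block division, calling divideblock directly. The sample count Q = 20 and the 0.6/0.2/0.2 ratios are assumptions chosen only to keep the output short.

```matlab
% Sketch: divide 20 sample indices into three contiguous blocks.
% Assumptions: Q = 20 and the 0.6/0.2/0.2 ratios are illustrative values.
Q = 20;
[trainInd, valInd, testInd] = divideblock(Q, 0.6, 0.2, 0.2);

% The training block comes first, then validation, then testing.
disp(trainInd)
disp(valInd)
disp(testInd)
```

Block division keeps the original sample order, so it is only a sensible choice when that order is not correlated with the quantity being modeled.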

If net.divideFcn is set to 'divideint', then the data is divided by an interleaved method, as in dealing a deck of cards. Each subset therefore draws its samples from across the whole data set while still receiving its specified percentage of the data. The fraction of the original data that goes into each subset is determined by the same three division parameters used for dividerand.
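The interleaved division can be seen by calling divideint directly on a small index range. As before, Q = 20 and the 0.6/0.2/0.2 ratios are illustrative assumptions.

```matlab
% Sketch: divide 20 sample indices using an interleaved selection.
% Assumptions: Q = 20 and the 0.6/0.2/0.2 ratios are illustrative values.
Q = 20;
[trainInd, valInd, testInd] = divideint(Q, 0.6, 0.2, 0.2);

% Each subset is spread across the full index range rather than
% occupying one contiguous block.
disp(trainInd)
disp(valInd)
disp(testInd)
```

Unlike dividerand, this division is deterministic, so repeated runs use the same subsets.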

If net.divideFcn is set to 'divideind', then the data is divided by index. The indices for the three subsets are defined by the division parameters net.divideParam.trainInd, net.divideParam.valInd, and net.divideParam.testInd. The default value of each of these parameters is an empty array, so you must set the indices when using this option.
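A minimal sketch of index-based division follows. It assumes Deep Learning Toolbox is installed; simplefit_dataset, the hidden layer size of 10, and the roughly 60/20/20 split are placeholders, and the variable names nTrain and nVal are introduced here only for illustration.

```matlab
% Sketch: assign samples to the three subsets by explicit index.
% Assumptions: simplefit_dataset and feedforwardnet(10) are placeholders;
% nTrain and nVal are illustrative helper variables.
[x, t] = simplefit_dataset;
Q = size(x, 2);                      % number of samples (columns)

nTrain = floor(0.6 * Q);
nVal   = floor(0.2 * Q);

net = feedforwardnet(10);
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:nTrain;
net.divideParam.valInd   = nTrain+1 : nTrain+nVal;
net.divideParam.testInd  = nTrain+nVal+1 : Q;

net.trainParam.showWindow = false;   % suppress the training window
[net, tr] = train(net, x, t);

% The training record reports which indices were used for each subset.
fprintf('%d train, %d validation, %d test samples\n', ...
        numel(tr.trainInd), numel(tr.valInd), numel(tr.testInd));
```

Index-based division is useful when the split must be reproducible or must follow a grouping that the other division functions cannot express.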