Just a quick question.
How can I stop the training of a deep network (an LSTM, for instance) so that the final weights and biases are the ones from the epoch where the validation loss was at its minimum?
In other words, what is the point of having a validation set if the final network is NOT the one that minimizes the validation loss, given that training continues past that point and the network overfits anyway?
The Validation Patience parameter does not help here: it stops training only after the validation loss has already been rising for several epochs, so it stops too late, and setting it too small risks halting prematurely at a local minimum.
The only workaround I have found is to run the training twice: first to locate the epoch where the validation loss is minimal, then again with the maximum number of epochs set to that value. But that is a crazy solution...
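One common alternative to the two-pass workaround is to checkpoint the weights every time the validation loss improves, and restore that snapshot when training ends, regardless of when the patience criterion fires. Some frameworks expose this directly (e.g. Keras's `EarlyStopping` callback has a `restore_best_weights` option); below is a framework-agnostic Python sketch of the idea, where `val_losses` and `weights_per_epoch` stand in for whatever your training loop produces each epoch. The function name and data shapes are illustrative, not from any particular library.

```python
import copy

def train_with_best_checkpoint(val_losses, weights_per_epoch, patience=3):
    """Early stopping that returns the weights from the best epoch,
    not the weights at the moment training stopped."""
    best_loss = float("inf")
    best_weights = None
    wait = 0  # epochs since the last improvement
    for loss, weights in zip(val_losses, weights_per_epoch):
        if loss < best_loss:
            # new minimum: snapshot the weights before training moves on
            best_loss = loss
            best_weights = copy.deepcopy(weights)
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                break  # patience exhausted, but the best snapshot is kept
    return best_weights, best_loss

# Simulated validation curve: minimum at epoch 3, then overfitting.
# (Weights are stand-in labels here; in practice they would be tensors.)
losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.8]
weights = [f"w{i}" for i in range(len(losses))]
best_w, best_l = train_with_best_checkpoint(losses, weights, patience=3)
print(best_w, best_l)  # w3 0.5
```

With this pattern the patience value only decides *when to stop burning compute*; it no longer determines which weights you keep, so a single training run suffices.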