How does the L2 Regularization in a custom training loop work?
hanspeter
on 26 Sep 2024
Answered: Richard
on 26 Sep 2024
Hi,
I implemented a custom training loop to train a sequence-to-sequence regression model. I also implemented L2 regularization as described in the documentation here: https://de.mathworks.com/help/deeplearning/ug/specify-training-options-in-custom-training-loop.html#mw_50581933-e0ce-4670-9456-af23b2b6f337
Now I'm wondering how this works. If I look at other documentation, such as this guide from Google, it seems to work differently: Google describes L2 regularization as adding the squared weights to the loss, while in MATLAB it looks like I add the weights to the gradients. Isn't that something different? Is one way better than the other?
Cheers
0 comments
Accepted Answer
Richard
on 26 Sep 2024
Short story: these are the same.
Long story: the link you gave explains the underlying mathematics of L2 regularization and its effect on the weights, but it doesn't explain the mechanics of how adding a regularization term to the loss affects the minimization algorithm - it stops at saying you minimize (loss + lambda*complexity).
In this case the minimization is done by taking steps based on the gradient of the total loss with respect to each weight. For each weight parameter, the required d(loss + lambda*complexity)/dw equals dL/dw + d(lambda*complexity)/dw, i.e. the gradient is the sum of the non-regularized gradient and a regularization term. For an L2 penalty the complexity is the sum of squared weights, so that extra term is 2*lambda*w per weight (the constant factor of 2 is conventionally absorbed into lambda). It is exactly this sum - gradient plus scaled weight - that the MATLAB example is computing.
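You can check this equivalence numerically. Below is a minimal sketch in Python (the thread is about MATLAB, but the check is framework-independent); the single-weight setup, `base_loss`, and the value of `lam` are made-up toy choices, not anything from the MATLAB documentation:

```python
# Toy check: "add lambda*complexity to the loss" (loss-side, Google-style)
# vs "add a regularization term to each gradient" (gradient-side, MATLAB-style)
# give the same gradient, hence the same training step.

lam = 0.1   # regularization factor (lambda); arbitrary toy value
w = 2.0     # current value of a single weight; arbitrary toy value

def base_loss(w):
    # Unregularized loss L(w) = (w - 3)^2; made up for this demo.
    return (w - 3.0) ** 2

def total_loss(w):
    # Loss-side formulation: L2 penalty added to the loss itself.
    return base_loss(w) + lam * w ** 2

# Analytic gradient of the base loss only: dL/dw = 2*(w - 3).
grad_base = 2.0 * (w - 3.0)

# Gradient-side formulation: add d(lam*w^2)/dw = 2*lam*w to the gradient.
# (In practice the constant 2 is usually folded into lambda.)
grad_gradient_side = grad_base + 2.0 * lam * w

# Loss-side gradient via central finite differences on the total loss.
eps = 1e-6
grad_loss_side = (total_loss(w + eps) - total_loss(w - eps)) / (2 * eps)

print(grad_gradient_side)                         # -1.6
print(abs(grad_loss_side - grad_gradient_side))   # ~0 (finite-difference noise)
```

Both formulations produce the same per-weight gradient, so gradient-descent updates are identical; regularizing the gradients directly just skips materializing the penalty in the loss value.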
0 comments
More Answers (0)