Solving critic overestimation and how to explore a specific action range
Hello,
I am using a DDPG agent to tune a robot controller. All of my rewards are negative, my critic learning rate is 0.01, my actor learning rate is 0.0001 with the Adam optimizer, and my gradient thresholds are 1 (a simplified sketch of this setup is below). I have two questions:
1- When my action range is [0.00001 0.2], the critic's Q0 also predicts a negative value (although with a large bias over the actual value), but when my action range is [0.00001 0.5], my critic overestimates heavily, around large positive values. Why does this happen with the bigger action range?
2- I define my action range as [0.00001 0.5], but I know my best action sits somewhere around [0.1 0.2] most of the time. How should I define my actor to explore this range more? Is this related to the noise options? How should I define the Ornstein-Uhlenbeck noise options to explore this area? (A sketch of what I am considering is after the plot below.)
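For reference, this is roughly how I build the action specification and the agent options. It is a simplified sketch assuming a release that has rlOptimizerOptions; the actor/critic network definitions are omitted, and actor, critic, and obsInfo are placeholders for my own objects:

```matlab
% Action spec: one scalar action bounded to the range I mentioned
actInfo = rlNumericSpec([1 1], 'LowerLimit', 1e-5, 'UpperLimit', 0.5);

% Optimizer options (Adam is the default algorithm for rlOptimizerOptions)
criticOpts = rlOptimizerOptions('LearnRate', 1e-2, 'GradientThreshold', 1);
actorOpts  = rlOptimizerOptions('LearnRate', 1e-4, 'GradientThreshold', 1);

agentOpts = rlDDPGAgentOptions( ...
    'CriticOptimizerOptions', criticOpts, ...
    'ActorOptimizerOptions',  actorOpts);

% actor and critic are my rlContinuousDeterministicActor / rlQValueFunction
agent = rlDDPGAgent(actor, critic, agentOpts);
```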
![](https://www.mathworks.com/matlabcentral/answers/uploaded_files/1493642/image.png)
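For my second question, this is what I am considering for the Ornstein-Uhlenbeck noise options, so that most exploration lands near [0.1 0.2]: center the noise mean around 0.15 and keep the standard deviation small. Is this the right way to do it? (Property names are as in recent releases; older releases use Variance / VarianceDecayRate instead of StandardDeviation / StandardDeviationDecayRate.)

```matlab
% Bias the OU exploration noise toward the band where I expect good actions.
% Note the noise is added to the actor output, so this only shifts exploration.
agentOpts.NoiseOptions.Mean                       = 0.15;  % center of [0.1 0.2]
agentOpts.NoiseOptions.MeanAttractionConstant     = 0.15;  % pull noise back toward the mean
agentOpts.NoiseOptions.StandardDeviation          = 0.05;  % keep most samples near [0.1 0.2]
agentOpts.NoiseOptions.StandardDeviationDecayRate = 1e-5;  % slowly reduce exploration
agentOpts.NoiseOptions.StandardDeviationMin       = 0.01;
```

I am also not sure whether it would be better to bound the actor output itself (for example with a tanhLayer followed by a scalingLayer) so the network naturally outputs values in that band.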