Is it possible to change RL action values under certain conditions?
I want my agent to output a target value, but in certain situations (when the reward drops dramatically) I want the agent to look for a better solution by letting it change the target value. I tried using an Initial Condition block so that the target value is used at the start. However, after some training episodes my agent (PPO) always outputs an average value.
5 Comments
Emmanouil Tzorakoleftherakis
on 18 May 2021
Can you provide some more information? What do you mean by letting the agent change the target value? Isn't that what happens by default every time the agent takes an action? What is the environment architecture?
black_cat
on 18 May 2021
Emmanouil Tzorakoleftherakis
on 19 May 2021
Thanks. It's still not clear to me what you mean by "However, this results in having an output of 3 since the agent is averaging it during training". If it's best to output a 6, the agent should do so; why would it average the output? Or are you talking about the average episode reward that you see in the Episode Manager?
Answers (0)