How to modify actions in experiences during reinforcement learning training
Hi experts
I am working on a reinforcement learning project in which the formulated problem has a very large discrete action set. Instead of using deep Q-learning with discrete actions, I turned to DDPG with a continuous action space. Each time I get an action from the actor network, I want to discretize it to the closest VALID discrete action, and then store that discrete action in the experience instead of the original continuous action. By default, DDPG training in MATLAB seems to store the original action generated by the actor network plus noise. Is there any way to MODIFY the stored action in the experience before it is pushed into the memory buffer? Thanks!
1 Comment
Ran
on 29 Jul 2022
Answers (1)
Emmanouil Tzorakoleftherakis
on 29 Jul 2022
1 vote
If you are working in Simulink, you can use the "Last Action" port of the RL Agent block to indicate which action was actually applied to the environment.
If your environment is in MATLAB, you can either move it to Simulink with a MATLAB Fcn block and follow the above, or you can write your own custom training loop.
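For the discretization step described in the question, a minimal sketch in plain MATLAB might look like the following. Everything here is an assumption for illustration: `nearestValidAction`, `validActions`, `rawAction`, and the plain struct and cell array used as an experience buffer are hypothetical names and stand-ins, not Reinforcement Learning Toolbox APIs. In Simulink, the body of the function could go into a MATLAB Fcn block whose output feeds both the environment and the "Last Action" port; in a custom training loop, the same mapped action is what you would store.

```matlab
% Hypothetical set of valid discrete actions, one action per row.
validActions = [0 0; 0 1; 1 0; 1 1; 2 1];

% Stand-in for the actor output plus exploration noise (not a toolbox call).
rawAction = [0.8 0.3];

% Action that is actually applied to the environment.
appliedAction = nearestValidAction(rawAction, validActions);  % -> [1 0]

% Store the applied (discretized) action, not the raw one.
% The other fields are left empty because they do not matter for this sketch.
experience = struct('Observation', [], 'Action', appliedAction, ...
    'Reward', [], 'NextObservation', [], 'IsDone', false);

% Plain cell array used as a stand-in for a replay buffer.
buffer = {};
buffer{end+1} = experience;

function aDisc = nearestValidAction(aCont, validActions)
% Return the row of validActions closest (in Euclidean distance) to aCont.
% aCont is reshaped to a row so it broadcasts against the N-by-d matrix.
sqDist = sum((validActions - aCont(:).').^2, 2);
[~, idx] = min(sqDist);
aDisc = validActions(idx, :);
end
```

An exhaustive nearest-row search like this is only practical while the valid action set fits comfortably in memory; for a very large set, a structured mapping (for example, rounding each component to a grid) may be a better fit.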
7 Comments
Ran
on 9 Aug 2022
Ran
on 9 Aug 2022
Emmanouil Tzorakoleftherakis
on 9 Aug 2022
The value at the "Last Action" port is the one stored in the experience buffer, not the actual output of the RL Agent block.
Ran
on 9 Aug 2022
Emmanouil Tzorakoleftherakis
on 9 Aug 2022
Edited: Emmanouil Tzorakoleftherakis
on 9 Aug 2022
Ignoring observations, reward, IsDone, here is an example:

Also, make sure to use the "Last Action" port only with off-policy agents, as mentioned in the documentation. Hope this helps.
Ran
on 9 Aug 2022
Ran
on 11 Aug 2022
