Custom Action Space DDPG Reinforcement Learning Agent
I have run into a problem with my reinforcement learning agent and would appreciate even a small hint.
My DDPG agent has a continuous action space, and training works fine. Unfortunately, the resulting policy cannot be transferred to the real-life system as it is: while searching for the optimal action values in different situations, the agent should avoid certain action values.
The action space is currently defined as:
actionInfo = rlNumericSpec([4 1], ...
'LowerLimit', [0; 0; 0; 0], ...
'UpperLimit', [maxA1; maxA2; maxA3; maxA4]);
Due to restrictions in the real-life system, however, each action should be either exactly zero or lie within a bounded interval, for example
A1 = 0 or A1 in [minA1, maxA1]
so that actions in the open interval
0 < A1 < minA1
are avoided.
Is there any way to define my action space like this?
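To make the constraint concrete, here is a minimal sketch of a mapping that snaps a raw action from the box [0, maxA] onto the allowed set (0 or [minA, maxA]). It only illustrates the target set and is not a built-in rlNumericSpec option; the numeric limits, the names clip, snap, aRaw and aSafe, and the choice of minA/2 as the switching threshold are all assumptions made for this example. Such a mapping could, for instance, be applied inside the environment before the action reaches the plant, with the caveat that the agent then no longer sees the action that was actually applied.

```matlab
% Assumed example limits (hypothetical values, one row per action)
minA = [0.2; 0.5; 0.1; 0.3];
maxA = [1.0; 2.0; 1.0; 1.5];

% Clip the raw action to the original box [0, maxA]
clip = @(a) min(max(a, 0), maxA);

% Snap onto {0} U [minA, maxA]:
%   clipped value >= minA           -> keep the value
%   minA/2 <= clipped value < minA  -> round up to minA
%   clipped value < minA/2          -> round down to 0
% The threshold minA/2 is an arbitrary choice for this sketch.
snap = @(a) (clip(a) >= minA) .* clip(a) + ...
            (clip(a) >= minA/2 & clip(a) < minA) .* minA;

% Example: a raw action as the actor might produce it
aRaw  = [0.05; 0.4; 0.5; 2.0];
aSafe = snap(aRaw);
disp(aSafe)
```

In this example the first element falls in the lower half of its forbidden gap and is set to 0, the second is raised to its minimum, the third is already allowed, and the fourth is clipped to its upper limit.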
Note:
I have already tried to steer the agent away from this range by penalizing such actions in the reward, but it does not seem to work. Instead of steadily improving over the episodes, the reward now plateaus after reaching a certain (undesirable) level.
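The exact penalty used is not shown here. As a point of reference, a graded penalty that grows with the distance to the nearest allowed value (rather than a fixed penalty for any violation) might be sketched as follows. The limits, the weight w, and the names gapDist and penalty are assumptions for illustration; whether this shaping trains better than a flat penalty is not something this sketch establishes.

```matlab
% Assumed example lower limits (hypothetical values, one row per action)
minA = [0.2; 0.5; 0.1; 0.3];

% Distance from each action to the nearest allowed value (0 or minA);
% zero for actions that are already outside the forbidden gap
gapDist = @(a) (a > 0 & a < minA) .* min(a, minA - a);

% Penalty term to add to the reward: the gap distance is normalized
% by minA so that all four actions are weighted comparably
penalty = @(a, w) -w * sum(gapDist(a) ./ minA);

% Example: the first two actions lie inside their forbidden gaps
a = [0.05; 0.4; 0.5; 1.5];
p = penalty(a, 10);
disp(p)
```

Because the penalty shrinks smoothly to zero at both ends of the gap, the reward stays continuous in the action.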
Thanks in advance!