Applying reinforcement learning with two continuous actions. During training one varies but the other is virtually static.

Hello,
I am trying to train a DDPG agent to control a vehicle's steering angle and velocity (kinematic model). The goal is to train the agent so that the vehicle moves from an initial (x, y, theta) pose to a final (x, y, theta) pose. A single agent performs both actions.
The action ranges are [-0.78, +0.78] for the steering angle and [-2.5, +2.5] for the velocity. The actor network ends in a tanh layer followed by a scaling layer with [0.78; 2.5]. During training, I noticed that the steering angle does not change (it stays stuck at 0.78) while the velocity varies, and this affects the training. What could be the reason for this? Is a single agent okay for this task? I am still learning RL. Any suggestions would be helpful.

Answers (1)

You should be able to use a single agent for this task. Since you are using DDPG, the first thing I would check is whether the noise options are set properly for both inputs.
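To make the noise suggestion concrete, here is a minimal sketch in plain Python (not the Reinforcement Learning Toolbox API). It reproduces the tanh-plus-scaling output described in the question and adds Ornstein-Uhlenbeck exploration noise with a separate standard deviation per action. The names `scale_action`, `OUNoise`, and `explore`, and the values for `theta`, `dt`, and the 10% noise level, are assumptions chosen for illustration only.

```python
import math
import random

# Action limits taken from the question: steering angle, velocity.
ACTION_SCALE = [0.78, 2.5]


def scale_action(pre_activation):
    """Map raw actor outputs to the action ranges via tanh and per-channel scaling."""
    return [s * math.tanh(z) for s, z in zip(ACTION_SCALE, pre_activation)]


class OUNoise:
    """Ornstein-Uhlenbeck noise with one standard deviation per action channel."""

    def __init__(self, sigma, theta=0.15, dt=0.1, seed=0):
        self.sigma = list(sigma)      # per-action noise scale (assumed values)
        self.theta = theta            # mean attraction constant (assumed value)
        self.dt = dt                  # sample time (assumed value)
        self.rng = random.Random(seed)
        self.state = [0.0] * len(self.sigma)

    def sample(self):
        for i, s in enumerate(self.sigma):
            drift = -self.theta * self.state[i] * self.dt
            diffusion = s * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
            self.state[i] += drift + diffusion
        return list(self.state)


def explore(pre_activation, noise):
    """Add exploration noise to the scaled action, then clip to the valid range."""
    action = scale_action(pre_activation)
    noisy = [a + n for a, n in zip(action, noise.sample())]
    return [max(-s, min(s, v)) for s, v in zip(ACTION_SCALE, noisy)]


# Hypothetical choice: noise standard deviation at 10% of each action's limit,
# so both channels are explored in proportion to their range.
noise = OUNoise(sigma=[0.1 * s for s in ACTION_SCALE])
```

If the same noise magnitude were used for both channels, it would be relatively much larger for the steering angle (limit 0.78) than for the velocity (limit 2.5). Note also that `scale_action([10.0, 0.0])` returns a steering value essentially equal to 0.78: once the tanh input is large, the output sits at the limit and barely responds to changes in the input, which is one possible way an action can end up stuck at its bound.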

5 Comments

Follow-up to the question:
Reward = -(Xdiff^2 + Ydiff^2) + 20*(distance < 0.5 and abs(theta_diff) < 0.5)
Using DDPG, the agent is able to reduce the distance and reach the x,y position, but it does not get within the orientation angle tolerance required to earn the positive reward. The intent is to give a one-time positive reward (20) only when the agent satisfies both the distance AND the angle condition. No continuous positive reward would be used. Is there any suggested improvement to the reward, or is there an issue with using DDPG?
Sparse rewards are a bit more challenging because the agent needs to hit the exact triggering condition to receive them. By the way, what you have here is not a one-time reward unless you also use "distance < 0.5 and abs(theta_diff) < 0.5" as your IsDone signal. You could try adjusting the weight factor or increasing the distance threshold from 0.5.
By the way, there is a very similar example in Reinforcement Learning Toolbox here. You can use that to get some ideas as well
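As a rough illustration of the point about IsDone, the sketch below (plain Python, not Toolbox code) computes the reward from the follow-up and returns the same condition as the termination flag, so the bonus of 20 can be collected only once per episode. The function names, the pose tuple layout, and the angle wrapping to [-pi, pi] are assumptions added for the example.

```python
import math


def wrap_angle(a):
    """Wrap an angle to [-pi, pi] (an assumption; the thread does not say how theta_diff is computed)."""
    return math.atan2(math.sin(a), math.cos(a))


def reward_and_done(pose, goal, dist_tol=0.5, angle_tol=0.5, bonus=20.0):
    """Return (reward, is_done) for poses given as (x, y, theta) tuples."""
    x, y, theta = pose
    gx, gy, gtheta = goal
    x_diff = gx - x
    y_diff = gy - y
    theta_diff = wrap_angle(gtheta - theta)
    distance = math.hypot(x_diff, y_diff)

    # Same condition drives both the bonus and episode termination,
    # which is what makes the bonus a one-time reward.
    reached = distance < dist_tol and abs(theta_diff) < angle_tol
    reward = -(x_diff ** 2 + y_diff ** 2) + (bonus if reached else 0.0)
    return reward, reached
```

For example, `reward_and_done((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))` returns a reward of 0 with `is_done` false: the position is reached but the heading error of 1.0 exceeds the tolerance. The distance term gives the agent no gradient toward the correct heading in that situation, which matches the behaviour described in the follow-up.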
Yes, it is the IsDone signal. I have tried to open the link you shared, but it's not working. Could you share it again?
