Training Quadrotor using PPO agent

Question

Mahmoud Chick Zaouali on 27 Apr 2022

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/1706610-training-quadrotor-using-ppo-agent

Commented: SALMAN IJAZ on 3 Dec 2024 at 13:16

I am trying to control a quadrotor model using reinforcement learning. The agent should control the quadrotor so that it navigates to a desired position or follows a path. Right now I am trying to train my PPO agent to make the quadrotor hover. I built a dynamical model of the quadrotor with the 6DOF block, and then built the observation and reward functions for my agent.

I coded the actor-critic network and set my parameters. The problem is that my reward is always equal to 0 and my agent is not learning, and I suspect that I did not build the environment correctly. I have been working on my model for a long time and could not get my agent to learn at all. I would be really glad if someone could help me with this issue.

I attached my quadrotor reinforcement learning model along with the actor and critic code.

1 Comment

Unmanned Aerial and Space Systems on 2 May 2022

Hi, is there a solution to your problem? My model has the same problem. I shared it here:

https://www.mathworks.com/matlabcentral/answers/1708930-reinforcement-learning-based-quadrotor-control-using-soft-actor-critic-the-reward-is-not-converging?s_tid=prof_contriblnk


Answer 1

Emmanouil Tzorakoleftherakis on 28 Apr 2022

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/1706610-training-quadrotor-using-ppo-agent#answer_953225

Edited: Emmanouil Tzorakoleftherakis on 28 Apr 2022

Hello,

There are multiple things not set up properly, including:

1) The isdone flag seems to be 1 all the time, which terminates every episode early, after a single step.

2) The reward signal is often not a real scalar. One reason is that you are trying to calculate the square root of a negative number.

3) Your Simulink model has a lot of algebraic loops. I would get rid of those to make sure they don't interfere with training.

Hope that helps
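
As an illustration of points 1 and 2, here is a minimal, language-neutral sketch written in Python. It is not the attached Simulink model and not a Reinforcement Learning Toolbox API; the function names `hover_reward` and `is_done`, the `max_error` threshold, and the reward shape are all hypothetical choices made for the example. It shows a reward that stays a real, finite scalar because the square root is only ever taken of a sum of squares, and a termination flag that is true only when the vehicle has drifted too far from the target.

```python
import math


def hover_reward(position, target, max_error=10.0):
    """Return a real, finite scalar reward in [0, 1] for a hover task.

    All names and the reward shape are assumptions made for illustration.
    """
    # A sum of squares cannot be negative, so the square root below is
    # always real. Taking the root of a signed quantity (for example a
    # plain difference) is what produces a non-real reward.
    squared_distance = sum((p - t) ** 2 for p, t in zip(position, target))
    distance = math.sqrt(squared_distance)

    # Saturate the error so the reward stays bounded.
    reward = 1.0 - min(distance, max_error) / max_error

    # Fail loudly instead of passing NaN to the training loop.
    if not math.isfinite(reward):
        raise ValueError("reward is not a finite real scalar")
    return reward


def is_done(position, target, max_error=10.0):
    """Terminate only when the vehicle is farther than max_error from target."""
    return math.dist(position, target) > max_error


if __name__ == "__main__":
    target = (0.0, 0.0, 1.0)
    for position in [(0.0, 0.0, 1.0), (0.0, 0.0, 6.0), (0.0, 0.0, 20.0)]:
        print(position, hover_reward(position, target), is_done(position, target))
```

In a Simulink model, the same two checks would apply to whatever signals are wired to the reward and isdone inputs of the RL Agent block: the termination condition should be false at the initial state, and the reward should be logged for a few steps to confirm that it is real and finite.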

2 Comments

Unmanned Aerial and Space Systems on 2 May 2022

Hi, I have a similar problem and shared my model here:

https://www.mathworks.com/matlabcentral/answers/1708930-reinforcement-learning-based-quadrotor-control-using-soft-actor-critic-the-reward-is-not-converging?s_tid=prof_contriblnk

SALMAN IJAZ on 3 Dec 2024 at 13:16

Hello, is your issue resolved?
