Is my actor-critic agent learning?

karim bio gassi

21 Aug. 2021

1 answer

Updated 4 Oct. 2022


I built an actor-critic agent for microgrid energy management. It has to choose the charging/discharging energy from a set of actions.

In total, 9 actions are available over 7008 time steps. I am training the agent for 2000 episodes. However, I notice that when the agent starts learning at a certain episode, its performance collapses completely in the following episodes. I attached the training plot for the first 250 episodes.

I wonder if there is something wrong in my code.


Answers (1)

Ahmed R. Sayed on 4 Oct. 2022


Hi, karim bio gassi,

From your figure, the discounted reward value is very large. Try rescaling the reward inside the environment to a bounded range such as [-10, 10]. For example, r(t) = 10 * (microgrid operational cost at step t) / MaxCost, where MaxCost is the maximum possible cost per time step.
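As a rough illustration of the rescaling above, here is a minimal Python sketch (not taken from the original code, which was not posted). The function name `scale_reward`, its parameters, and the clipping step are assumptions made for this example; the same arithmetic would go into the environment's step function. The formula is kept exactly as written above; if the cost is a positive quantity to be minimized, the sign may need to be flipped, which the answer does not specify.

```python
def scale_reward(cost, max_cost, bound=10.0):
    """Rescale a per-step operational cost into [-bound, bound].

    Hypothetical helper: 'cost' is the microgrid operational cost at the
    current time step, 'max_cost' is the largest possible cost per step.
    """
    if max_cost <= 0:
        raise ValueError("max_cost must be positive")
    # r(t) = bound * cost(t) / MaxCost, as suggested in the answer.
    scaled = bound * cost / max_cost
    # Assumption: clip in case a cost ever exceeds the estimated maximum.
    return max(-bound, min(bound, scaled))


# Example: a cost of 50 with an assumed maximum of 200 per step.
reward = scale_reward(50.0, 200.0)
print(reward)
```

Clipping is optional; it only guards against an underestimated MaxCost pushing the reward outside the intended range.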

Another option is to try a different type of agent.

I hope these suggestions help resolve your issue.