Create custom policy function for a RL DQN.

Question

Hi Community,

I am working on a project that requires a small modification of the DQN policy. The learned function is still Q, but instead of simply taking argmax Q(s,a), I want to add a few more conditions (most likely some if statements acting as hard constraints). Is it possible to make this change? If so, where should I make it?

Best regards,

Yiyang

Answer 1

Anh Tran on 27 Mar 2020

Currently I do not see any way to modify the DQN policy directly with the built-in rlDQNAgent. A possible workaround is to reimplement the DQN agent with rlQValueRepresentation, introduced in MATLAB R2020a.

You can refer to the RL custom training loop example, where we implement vanilla policy gradients with RL Toolbox.

For discrete action spaces, I would recommend a multi-output Q-value representation Q(o), which returns the values of all actions at once (better performance than Q(o,a)).

% Create a Q(o) critic, assuming NeuralNet, ObservationInfo, ActionInfo and ObsLayerName are already defined
Critic = rlQValueRepresentation(NeuralNet,ObservationInfo,ActionInfo,'Observation',ObsLayerName);
% Get the Q-values of all actions for an observation RandomObservation
Q = getValue(Critic,RandomObservation)
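Once Q holds one value per discrete action, the hard constraints from the question could be applied before the argmax. The following is only a minimal sketch of that idea, not a built-in feature of rlDQNAgent: the numbers in Q and the mask isAllowed are made up for illustration, and in a custom training loop Q would instead come from the getValue call above.

```matlab
% Example Q-values for 4 discrete actions (made-up numbers for illustration).
% In a custom loop these would come from: Q = getValue(Critic,RandomObservation)
Q = [1.2; 3.5; 0.7; 2.9];

% Hypothetical hard constraint: action 2 is not permitted in this state.
% Replace this with your own if statements that build the logical mask.
isAllowed = [true; false; true; true];

% Guard against a state in which every action is ruled out.
assert(any(isAllowed), 'At least one action must be allowed.');

% Set the Q-values of forbidden actions to -Inf so they can never win.
maskedQ = Q;
maskedQ(~isAllowed) = -Inf;

% Greedy choice among the remaining (feasible) actions.
[~, idx] = max(maskedQ);
```

Here idx is the index of the best permitted action. If ActionInfo is an rlFiniteSetSpec, the corresponding action value would presumably be ActionInfo.Elements(idx), assuming the critic outputs follow the same order as Elements. Exploration (for example epsilon-greedy) would need the same mask, so that random actions are also drawn only from the permitted set.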
