how to manage the time of updating the network weights in DRL

1 view (last 30 days)
MOHAMMADREZA
MOHAMMADREZA on 10 Mar 2025
Answered: Jack on 10 Mar 2025
Hi, I am trying to write a DRL agent. I do not need to update the weights of the neural networks at every step, only every n steps. How can I manage this? In particular, I do not know when the weight update actually happens. Is it after exiting the step function?

Answers (1)

Jack
Jack on 10 Mar 2025
You can control when the network weights are updated by decoupling the weight update routine from the per-step logic. If you're writing your own training loop, maintain a counter that increments on each step and call the weight update function only every n steps. For example, in a custom loop:
counter = 0;
n = 10; % update every 10 steps
while training
    % Take an action, observe the next state and reward, etc.
    counter = counter + 1;
    % Store the experience, perform other step-related tasks
    % Update the weights only every n steps
    if counter == n
        updateNetworkWeights(); % your function that performs a training update
        counter = 0;
    end
end
If you're using a built-in agent from MATLAB's Reinforcement Learning Toolbox, check the agent options instead. For example, for a DQN agent there is a LearnStepPeriod property in rlDQNAgentOptions that lets you specify the number of environment steps between network training updates.
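A minimal sketch of the options-based approach, assuming a toolbox version that exposes the LearnStepPeriod property mentioned above (the critic network variable here is hypothetical):

```matlab
% Configure a DQN agent to run a training update only every 10 steps.
% 'critic' is a placeholder for an rlQValueFunction you have already built.
opts = rlDQNAgentOptions;
opts.LearnStepPeriod = 10;   % assumed property name, per the answer above
agent = rlDQNAgent(critic, opts);
```

With this approach the agent handles the counting internally, so you do not need the manual counter from the custom loop.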
Regarding “when” the weights are updated, typically the update happens at the end of the step function (i.e., after the environment returns the next state and reward). This way, you ensure that the experience from that step is included before updating the network.
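The ordering described above can be sketched as follows; storeExperience and updateNetworkWeights are hypothetical helper names standing in for your own replay-buffer and training code:

```matlab
% One iteration of a custom loop, showing where the update sits:
% the environment transition and experience storage come first, so the
% current step's data is available when the update runs.
[nextObs, reward, isDone] = step(env, action);                 % transition first
storeExperience(buffer, obs, action, reward, nextObs, isDone); % then record it
if mod(stepCount, n) == 0
    updateNetworkWeights(); % only now, with this step's experience included
end
```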
If this helps, please accept the answer.

