Training with Deep Q-Network Agent

5 views (last 30 days)
wathek gatri on 7 Jun 2022
Answered: Anay on 21 Aug 2025
Hello,
I'm trying to use the approach described here to train my system with a basic reward function and discrete actions. The states are the (x,y) coordinates of a moving point. Is there a ready-made function that executes the training and the update of ϕ, or do I have to implement it manually?
The guide also says: 'Initialize the critic Q(s,a;ϕ) with random parameter values ϕ, and initialize the target critic parameters ϕt with the same values: ϕt = ϕ.'
But I set up my critic as follows and can't find where the ϕ values are stored:
-------------------------------------------------------------------------
obsInfo = rlNumericSpec([4 1]);                  % 4-element observation
actInfo = rlFiniteSetSpec([2 1 -1]);             % 3 discrete actions
net = [featureInputLayer(4,'Normalization','none')
       fullyConnectedLayer(3,'Name','value')];   % one Q value per action
net = dlnetwork(net);
critic = rlVectorQValueFunction(net,obsInfo,actInfo);
opt = rlDQNAgentOptions;
agent = rlDQNAgent(critic,opt);
-------------------------------------------------------------------------
Thanks!

Answers (1)

Anay on 21 Aug 2025
Hi wathek,
The parameters ϕ are the weights and biases of the critic's neural network. As the MATLAB documentation for rlVectorQValueFunction notes, the learnable parameters of the critic are the weights of its deep neural network.
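You can inspect them directly. A quick sketch, assuming the critic and net from your snippet:
% View the critic's learnable parameters (these are the phi values)
phi = getLearnableParameters(critic)   % cell array of weight/bias arrays
% Equivalently, inspect the underlying dlnetwork directly
net.Learnables                         % table with Layer, Parameter, Value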
You do not need to implement the training or the update of ϕ yourself. Reinforcement Learning Toolbox handles the target network update for DQN; you only need to configure it in the agent options. Here is a sample configuration:
% Configure DQN agent options with target network settings
opt = rlDQNAgentOptions();
opt.TargetSmoothFactor = 1e-3; % Controls how quickly target network updates (τ)
opt.TargetUpdateFrequency = 4; % How often to update target network (in steps)
opt.ExperienceBufferLength = 10000; % Replay buffer size
opt.MiniBatchSize = 64; % Mini-batch size for training
opt.DiscountFactor = 0.99; % Discount factor (γ)
% Create the DQN agent
agent = rlDQNAgent(critic, opt);
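One thing to note: TargetSmoothFactor and TargetUpdateFrequency interact. With TargetSmoothFactor < 1 the agent performs a soft update ϕt ← τϕ + (1−τ)ϕt, while setting TargetSmoothFactor = 1 together with TargetUpdateFrequency > 1 gives the periodic hard copy ϕt = ϕ used in the classic DQN algorithm. For example:
% Periodic hard update: copy phi into phi_t every 4 agent steps
opt.TargetSmoothFactor = 1;
opt.TargetUpdateFrequency = 4;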
To train the agent, you'll need to create an environment and then call the train function. For example:
% Training options
trainOpts = rlTrainingOptions();
trainOpts.MaxEpisodes = 1000;
trainOpts.MaxStepsPerEpisode = 500;
trainOpts.Verbose = true;
trainOpts.Plots = "training-progress";
trainOpts.StopTrainingCriteria = "AverageReward";
trainOpts.StopTrainingValue = 200;
% Train the agent (env is your environment)
trainingStats = train(agent, env, trainOpts);
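Once training completes, a quick way to sanity-check the learned policy is to simulate it with sim, for example:
% Simulate the trained agent for one episode and total up the reward
simOpts = rlSimulationOptions('MaxSteps',500);
experience = sim(env, agent, simOpts);
totalReward = sum(experience.Reward.Data);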
You can create the environment in MATLAB or Simulink; the MATLAB documentation has sections on creating custom environments and on configuring and training DQN agents.
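If your environment is just a point moving in the plane, one lightweight option is rlFunctionEnv with custom step and reset functions. The sketch below is hypothetical: the dynamics, reward, and action set are placeholders, and it uses a 2-element (x,y) observation rather than the 4-element spec from your snippet.
% Hypothetical (x,y) point environment built with rlFunctionEnv
obsInfo = rlNumericSpec([2 1]);               % state: (x,y) coordinates
actInfo = rlFiniteSetSpec([-1 0 1]);          % placeholder discrete actions
resetFcn = @() deal([0; 0], struct('State',[0; 0]));
env = rlFunctionEnv(obsInfo, actInfo, @stepFcn, resetFcn);

function [nextObs, reward, isDone, logged] = stepFcn(action, logged)
    % Placeholder dynamics: the action nudges the x coordinate
    nextObs = logged.State + [action; 0.1];
    logged.State = nextObs;
    reward = -norm(nextObs);                  % basic reward: stay near the origin
    isDone = norm(nextObs) > 10;              % end the episode if the point drifts off
end
With env defined this way, the train call above works unchanged.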
Hope this helps!
