Filter löschen
Filter löschen

Q-table issues in the example "Q-learning in the basic grid world"

1 Ansicht (letzte 30 Tage)
Fangyuan Chang
Fangyuan Chang am 9 Nov. 2020
Kommentiert: Adi Firdaus am 11 Dez. 2021
I trained a Q-learning agent in the matlab predefined environment "BasicGridWorld". I have an issue about the updates of the Q-table. When I set the number of episode to be 1, and set the episode step to be 1, I expect that the new updated Q-value equals to (alpha * R) according to the Bellman equation, where alpha is the learning rate and R is the instant reward. However, the code generates a Q-value different from my expectation. Can anyone help? The code is attached as follows:
rng(0)
env = rlPredefinedEnv("BasicGridWorld");
qTable = rlTable(getObservationInfo(env),getActionInfo(env));
critic = rlQValueRepresentation(qTable,getObservationInfo(env),getActionInfo(env));
critic.Options.LearnRate = 0.1;
critic.Options.L2RegularizationFactor = 0;
critic.Options.Optimizer = "sgdm";
critic.Options.OptimizerParameters.Momentum = 0;
opts = rlQAgentOptions;
opts.EpsilonGreedyExploration.Epsilon = 0.8;
opts.EpsilonGreedyExploration.EpsilonMin = 0.01;
opts.EpsilonGreedyExploration.EpsilonDecay = 0.01;
opts.DiscountFactor = 0.5;
agent = rlQAgent(critic,opts);
trainOpts = rlTrainingOptions(...
'MaxEpisodes',1,...
'MaxStepsPerEpisode',1,...
'StopTrainingCriteria',"AverageReward",...
'StopTrainingValue',30,...
'Verbose',true,...
'Plots','none');
trainOpts.ScoreAveragingWindowLength = 50;
trainingStats = train(agent,env,trainOpts);
trained_critic=getCritic(agent);
trained_table = getLearnableParameters(trained_critic);
trained_qtable=trained_table{1};
% check the updated Q-value
[r,c]=find(trained_table{1,1}~=0);
Q_value = trained_table{1,1}(r,c)
Can anyone help point out my error?
Thank you very much.

Antworten (0)

Kategorien

Mehr zu Training and Simulation finden Sie in Help Center und File Exchange

Produkte


Version

R2020b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by