How to resume training a trained agent? (About Q-learning agents)
7 views (last 30 days)
Chao Wang
on 18 Apr 2021
Commented: Chao Wang on 23 Apr 2021
When I train with Q-learning, how can I resume training of an agent, keeping the experience it has already gathered, after I stop training midway? Will I still be able to see the training progress as I did the first time, and can I modify the parameters before resuming training?
By the way, after training, how do I see the values in the Q-table? Every time I open the Q-table, all the values are 0.
I don't understand why... Can somebody help me out with this?
Here is my code:
clc;
clear;
mdl = 'FOUR_DG_0331';
open_system(mdl);
agentBlk = ["FOUR_DG_0331/RL Agent1", "FOUR_DG_0331/RL Agent2", "FOUR_DG_0331/RL Agent3", "FOUR_DG_0331/RL Agent4"];
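% Finite-set observation and action specifications for the four agents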
oInfo = rlFiniteSetSpec([123,456,789]);
aInfo = rlFiniteSetSpec([150,160,170]);
aInfo1 = rlFiniteSetSpec([150,170]);
obsInfos = {oInfo,oInfo,oInfo,oInfo};
actInfos = {aInfo1,aInfo,aInfo,aInfo};
env = rlSimulinkEnv(mdl,agentBlk,obsInfos,actInfos);
Ts = 0.01;
Tf = 4;
rng(0);
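% Q-tables and table-based critics, one per agent (agent 1 uses a smaller action set)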
qTable1 = rlTable(oInfo,aInfo1);
qTable2 = rlTable(oInfo,aInfo);
qTable3 = rlTable(oInfo,aInfo);
qTable4 = rlTable(oInfo,aInfo);
criticOpts = rlRepresentationOptions('LearnRate',0.1);
Critic1 = rlQValueRepresentation(qTable1,oInfo,aInfo1,criticOpts);
Critic2 = rlQValueRepresentation(qTable2,oInfo,aInfo,criticOpts);
Critic3 = rlQValueRepresentation(qTable3,oInfo,aInfo,criticOpts);
Critic4 = rlQValueRepresentation(qTable4,oInfo,aInfo,criticOpts);
% Agent options (QAgent_opt) are defined here
% ...
agent1 = rlQAgent(Critic1,QAgent_opt);
agent2 = rlQAgent(Critic2,QAgent_opt);
agent3 = rlQAgent(Critic3,QAgent_opt);
agent4 = rlQAgent(Critic4,QAgent_opt);
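% Training options: run up to 1000 episodes, ceil(Tf/Ts) steps per episode,
% and save agents to the savedAgents folder during training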
trainOpts = rlTrainingOptions;
trainOpts.MaxEpisodes = 1000;
trainOpts.MaxStepsPerEpisode = ceil(Tf/Ts);
trainOpts.StopTrainingCriteria = "EpisodeCount";
trainOpts.StopTrainingValue = 1000;
trainOpts.SaveAgentCriteria = "EpisodeCount";
trainOpts.SaveAgentValue = 15;
trainOpts.SaveAgentDirectory = "savedAgents";
trainOpts.Verbose = false;
trainOpts.Plots = "training-progress";
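% Set doTraining = true to train; otherwise load a previously saved agent and run a simulation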
doTraining = false;
if doTraining
stats = train([agent1, agent2, agent3, agent4],env,trainOpts);
else
load(trainOpts.SaveAgentDirectory +"/Agents16.mat",'agent');
simOpts = rlSimulationOptions('MaxSteps',ceil(Tf/Ts));
experience = sim(env,[agent1, agent2, agent3, agent4],simOpts)
end
0 Comments
Accepted Answer
Emmanouil Tzorakoleftherakis
on 22 Apr 2021
Hello,
You don't have to do anything specific to continue training where you left off: you can simply begin training again, and the policy will start from whatever values/weights it had previously.
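To make that concrete, here is a minimal MATLAB sketch of both points: resuming training and reading out the learned Q-values. It is only an illustration, not code from the original answer. The variable name inside the saved MAT file (assumed to be saved_agent below) is an assumption, so check it with whos('-file',...) and adjust. getCritic and getLearnableParameters are the Reinforcement Learning Toolbox functions for extracting a critic and its learnable values; for a table-based critic, the learnable values are the Q-table entries. Note also that the rlTable objects in the workspace (qTable1, ...) are copied into the critics when the representations are created and are not updated by training, which is likely why they still show all zeros.
% Resume training: if the agents are still in the workspace, simply call
% train again; the critics keep whatever values they learned previously.
stats = train([agent1, agent2, agent3, agent4], env, trainOpts);
% If MATLAB was restarted, load a saved agent file first. The variable
% name "saved_agent" is an assumption; verify it with:
%   whos('-file', fullfile("savedAgents", "Agents16.mat"))
data = load(fullfile("savedAgents", "Agents16.mat"));
trainedAgents = data.saved_agent;   % assumed variable name inside the file
stats = train(trainedAgents, env, trainOpts);
% Inspect the learned Q-table of one agent: the trained values live in the
% agent's critic, not in the original qTable1 workspace variable.
critic = getCritic(agent1);               % extract the table-based critic
params = getLearnableParameters(critic);  % cell array of learnable values
disp(params{1})                           % the learned Q-values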
More Answers (0)