How to Use the Reinforcement Learning Toolbox to Plot Observations While Training?

Hi!
How can I use the Reinforcement Learning Toolbox to plot observations while training? Here is my code:
ObservationInfo = rlNumericSpec([12 1]);
% Initialize Action settings
ActionInfo = rlNumericSpec([6 1], ...
    'LowerLimit', [-1; -1; -1; -1; -1; -1], ...
    'UpperLimit', [1; 1; 1; 1; 1; 1]);
% Environment
env = rlFunctionEnv(ObservationInfo,ActionInfo,'myStepFunction','myResetFunction');
% Sample time
Ts = 0.02;
%% Deep Neural Network Options
% Define the critic network
statePath = [
    imageInputLayer([12 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(400,'Name','CriticStateFC1')
    reluLayer('Name','Criticrelu1')
    fullyConnectedLayer(300,'Name','CriticStateFC2')];
actionPath = [
    imageInputLayer([6 1 1],'Normalization','none','Name','action')
    fullyConnectedLayer(300,'Name','CriticActionFC1')];
commonPath = [
    additionLayer(2,'Name','add')
    reluLayer('Name','CriticCommonRelu')
    fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph();
criticNetwork = addLayers(criticNetwork,statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC2','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC1','add/in2');
criticOpts = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1);
critic = rlQValueRepresentation(criticNetwork,ObservationInfo,ActionInfo, ...
    'Observation',{'observation'},'Action',{'action'},criticOpts);
% Define the actor network
actorNetwork = [
    imageInputLayer([12 1 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(400,'Name','ActorFC1')
    reluLayer('Name','ActorRelu1')
    fullyConnectedLayer(300,'Name','ActorFC2')
    reluLayer('Name','ActorRelu2')
    fullyConnectedLayer(6,'Name','ActorFC3')
    tanhLayer('Name','ActorTanh')
    scalingLayer('Name','ActorScaling','Scale',max(ActionInfo.UpperLimit))];
actorOpts = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(actorNetwork,ObservationInfo,ActionInfo, ...
    'Observation',{'observation'},'Action',{'ActorScaling'},actorOpts);
%% Set Agent and DDPG Options
agentOpts = rlDDPGAgentOptions(...
    'SampleTime',Ts,...
    'TargetSmoothFactor',1e-3,...
    'ExperienceBufferLength',1e5,...
    'DiscountFactor',0.99,...
    'MiniBatchSize',128);
agentOpts.NoiseOptions.Variance = 0.6;
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
agent = rlDDPGAgent(actor,critic,agentOpts);
%% Set Training Options
maxepisodes = 100;
trainOpts = rlTrainingOptions(...
    'MaxEpisodes',maxepisodes,...
    'MaxStepsPerEpisode',1000,...
    'ScoreAveragingWindowLength',50,...
    'Verbose',false,...
    'Plots','training-progress',...
    'StopTrainingCriteria','AverageReward',...
    'StopTrainingValue',0,...
    'SaveAgentCriteria','EpisodeReward',...
    'SaveAgentValue',0);
%% Training
% Train the DDPG agent on the environment.
trainingStats = train(agent,env,trainOpts);
I would be grateful if you could help me!

Answers (1)

Emmanouil Tzorakoleftherakis
You can use the information on plotting and visualization from this page to plot or visualize data during training.
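As far as I know, an environment created with rlFunctionEnv has no built-in plot method, so one simple option is to update a figure yourself from inside your step function. Below is a minimal sketch under that assumption; the helper name plotObservation, the persistent graphics handles, and the choice of plotting only the first observation channel are illustrative additions, not part of your posted code or of the toolbox API.

function plotObservation(obs)
% Hypothetical helper: call it near the end of myStepFunction,
% e.g. plotObservation(NextObs), to live-plot one observation channel.
persistent hLine stepCount
if isempty(hLine) || ~isvalid(hLine)
    figure('Name','Observation during training');
    hLine = animatedline('Marker','.');
    xlabel('Training step'); ylabel('Observation(1)');
    stepCount = 0;
end
stepCount = stepCount + 1;
addpoints(hLine,stepCount,obs(1));
drawnow limitrate   % refresh the figure without stalling training
end

If live graphics slow training down too much, an alternative is to accumulate the observations in LoggedSignals inside the step function (keeping in mind that your reset function re-initializes it at the start of every episode) and plot them after train returns.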

Version: R2021b
