Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors No system or file called 'myfilenamemyfilename' found
Hi,
I am trying to use a DDPG agent for continuous-time control via the RL Agent block in Simulink. My system has 19 observation variables, 4 action outputs, and a custom reset function; the rest of the code is shown below. Note that I already had a different continuous DDPG agent working, as far as I can tell, with 36 observation variables, 4 action outputs, and a custom reset function. The only differences with this new one are that I have altered the Simulink model's reward function and observation space to align with the new observation variables, and that I have altered the reset function to match the new model's name and variable ordering. I have ensured that a single scalar value goes into the reward input of the RL Agent block and a single scalar value goes into its isdone input.
Also note that before the error I receive a warning about my observation space, which is included in the output below:
close all
clear all
%%%%%%%%%%%%%%%%%%%%%%% F16 Control Script %%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Running Parameters
mdl = 'f16rl';
% train = true;
agent_type = 'DDPG';
numActs = 4;
device = 'cpu';
is_parallel = false;
%% Add necessary folders to path
% add the stevens and lewis model
addpath([pwd '\StevensLewisF16_mk2']);
% load up the airconstants
airconst;
% low alpha x0
p0 = 0; % p (rad/s)
q0 = 0; % q (rad/s)
r0 = 0; % r (rad/s)
alpha0 = 1.8; % alpha (deg)
beta0 = 0; % beta (deg)
V0 = 220; % V (m/s)
phi0 = 0; % phi (deg)
theta0 = 0; % theta (deg)
psi0 = 0; % psi (deg)
xe0 = 0; % xe (m)
ye0 = 0; % ye (m)
h0 = 4572; % h (m)
dstab0 = -1.2; % dstab (deg)
ail0 = 0; % ail (deg)
rud0 = 0; % rud (deg)
throttle0 = 0; % throttle (0-1)
power0 = 18;
x0 = [p0,q0,r0,alpha0,beta0,V0,phi0,theta0,psi0,xe0,ye0,h0];
U0 = [dstab0,ail0,rud0,throttle0];
mod0 = linmod('f16lin',[x0, power0],U0);
p_min = -100; % (rad/s)
q_min = -100; % (rad/s)
r_min = -100; % (rad/s)
alpha_min = -20; % (deg)
beta_min = -30; % (deg)
V_min = 0; % (m/s)
phi_min = -180; % (deg)
theta_min = -180; % (deg)
psi_min = -180; % (deg)
xe_min = -10000; % (m)
ye_min = -10000; % (m)
h_min = 0; % (m)
engine_power_min = 0;
engine_power_max = 100000;
p_max = 100; % (rad/s)
q_max = 100; % (rad/s)
r_max = 100; % (rad/s)
alpha_max = 45; % (deg)
beta_max = 30; % (deg)
V_max = 343; % (m/s)
phi_max = 180; % (deg)
theta_max = 180; % (deg)
psi_max = 180; % (deg)
xe_max = 10000; % (m)
ye_max = 10000; % (m)
h_max = 100000; % (m)
alpha_w = 1.0;
beta_w = 1.0;
V_w = 1.0;
%% Create the environment
% This builds the environment by passing it the observation space, the
% action space, and the Simulink model containing the RL Agent block, and
% by attaching the reset function defined in ResetFcn.m. It returns an
% environment 'env', along with the observation and action info specs.
obsmin_int_errors = [-inf,-inf,-inf];
obsmin_errors = [-(alpha_max-alpha_min),-(beta_max-beta_min),-(V_max-V_min)];
obsmin_states = [p_min,q_min,r_min,alpha_min,beta_min,V_min,phi_min,theta_min,psi_min,xe_min,ye_min,h_min,engine_power_min];
obsmin = [obsmin_int_errors,obsmin_errors,obsmin_states];
obsmax_int_errors = [inf,inf,inf];
obsmax_errors = [(alpha_max-alpha_min),(beta_max-beta_min),(V_max-V_min)];
obsmax_states = [p_max,q_max,r_max,alpha_max,beta_max,V_max,phi_max,theta_max,psi_max,xe_max,ye_max,h_max,engine_power_max];
obsmax = [obsmax_int_errors,obsmax_errors,obsmax_states];
[env, obsInfo, actInfo] = init_env(obsmin,obsmax,numActs,mdl);
disp('environment created successfully')
%% init_DDPG_agent
% Specify the simulation time Tf and the agent sample time Ts in seconds
Ts = 0.5;
Tf = 200;
% Fix the random generator seed for reproducibility
rng(0);
% grab input and output dimensions of NN
numObservations = obsInfo.Dimension(1);
numActions = actInfo.Dimension(1);
% Create DDPG Agent
statePath = [
featureInputLayer(numObservations,'Normalization','none','Name','State')
fullyConnectedLayer(50,'Name','CriticStateFC1')
reluLayer('Name','CriticRelu1')
fullyConnectedLayer(50,'Name','CriticStateFC2');
reluLayer('Name','CriticRelu2')
fullyConnectedLayer(50,'Name','CriticStateFC3');
reluLayer('Name','CriticRelu3')
fullyConnectedLayer(50,'Name','CriticStateFC4');
reluLayer('Name','CriticRelu4')
fullyConnectedLayer(25,'Name','CriticStateFC5')];
actionPath = [
featureInputLayer(numActions,'Normalization','none','Name','Action')
fullyConnectedLayer(50,'Name','CriticActionFC1');
reluLayer('Name','CriticRelu5')
fullyConnectedLayer(50,'Name','CriticActionFC2');
reluLayer('Name','CriticRelu6')
fullyConnectedLayer(50,'Name','CriticActionFC3');
reluLayer('Name','CriticRelu7')
fullyConnectedLayer(25,'Name','CriticActionFC4')];
commonPath = [
additionLayer(2,'Name','add')
reluLayer('Name','CriticCommonRelu')
fullyConnectedLayer(1,'Name','CriticOutput')];
criticNetwork = layerGraph();
criticNetwork = addLayers(criticNetwork,statePath);
criticNetwork = addLayers(criticNetwork,actionPath);
criticNetwork = addLayers(criticNetwork,commonPath);
criticNetwork = connectLayers(criticNetwork,'CriticStateFC5','add/in1');
criticNetwork = connectLayers(criticNetwork,'CriticActionFC4','add/in2');
% Specify options for the critic representation using rlRepresentationOptions
criticOpts = rlRepresentationOptions('LearnRate',1e-03,'GradientThreshold',1,'UseDevice',device);
% Create the critic representation using the specified deep neural network
% and options. You must also specify the action and observation specifications
% for the critic, which you obtain from the environment interface. For more
% information, see rlQValueRepresentation.
critic = rlQValueRepresentation(criticNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},criticOpts);
% Given observations, a DDPG agent decides which action to take using an actor
% representation. To create the actor, first create a deep neural network with
% one input, the observation, and one output, the action.
% Construct the actor in a similar manner to the critic. For more information,
% see rlDeterministicActorRepresentation.
actorNetwork = [
featureInputLayer(numObservations,'Normalization','none','Name','State')
fullyConnectedLayer(36*3, 'Name','actorFC1')
tanhLayer('Name','actorTanh1')
fullyConnectedLayer(36*3, 'Name','actorFC2')
tanhLayer('Name','actorTanh2')
fullyConnectedLayer(36*3, 'Name','actorFC3')
tanhLayer('Name','actorTanh3')
fullyConnectedLayer(36*3, 'Name','actorFC4')
tanhLayer('Name','actorTanh4')
fullyConnectedLayer(36*3, 'Name','actorFC5')
tanhLayer('Name','actorTanh5')
fullyConnectedLayer(numActions,'Name','Action')
];
actorOptions = rlRepresentationOptions('LearnRate',1e-04,'GradientThreshold',1,'UseDevice',device);
actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,'Observation',{'State'},'Action',{'Action'},actorOptions);
% To create the DDPG agent, first specify the DDPG agent options using
% rlDDPGAgentOptions, rlACAgentOptions for A2C...
agentOpts = rlDDPGAgentOptions(...
'SampleTime',Ts,...
'TargetSmoothFactor',1e-3,...
'DiscountFactor',1.0, ...
'MiniBatchSize',64, ...
'ExperienceBufferLength',1e6);
agentOpts.NoiseOptions.Variance = 0.3;
agentOpts.NoiseOptions.VarianceDecayRate = 1e-5;
% Then, create the DDPG agent using the specified actor representation,
% critic representation, and agent options. For more information, see rlDDPGAgent.
agent = rlDDPGAgent(actor,critic,agentOpts);
%% Train_DDPG
% To train the agent, first specify the training options. For this example,
% use the following options:
%
% - Run training for at most 1,000,000 episodes. Specify that each episode
% lasts for at most ceil(Tf/Ts) (that is, 400) time steps.
%
% - Display the training progress in the Episode Manager dialog box (set the Plots
% option) and disable the command line display (set the Verbose option to false).
%
% - Stop training when the agent receives an average cumulative reward greater
% than 80000 over 20 consecutive episodes (StopTrainingValue below), and save
% candidate agents once the average reward exceeds 2000 (SaveAgentValue).
%
% For more information, see rlTrainingOptions.
maxepisodes = 1000000;
maxsteps = ceil(Tf/Ts);
trainOpts = rlTrainingOptions(...
'MaxEpisodes',maxepisodes, ...
'MaxStepsPerEpisode',maxsteps, ...
'ScoreAveragingWindowLength',20, ...
'Verbose',false, ...
'Plots','training-progress',...
'StopTrainingCriteria','AverageReward',...
'StopTrainingValue',80000,...
'SaveAgentCriteria','AverageReward',...
'SaveAgentValue',2000);
if is_parallel == true
trainOpts.UseParallel = true;
trainOpts.ParallelizationOptions.Mode = "sync";
% trainOpts.ParallelizationOptions.DataToSendFromWorkers = "gradients";%for A3C
trainOpts.ParallelizationOptions.StepsUntilDataIsSent = 20;
trainOpts.ParallelizationOptions.WorkerRandomSeeds = -1;
trainOpts.StopOnError = 'off';
end
% Train the agent using the train function. Training is a computationally
% intensive process that takes several minutes to complete. To save time
% while running this example, load a pretrained agent by setting doTraining
% to false. To train the agent yourself, set doTraining to true.
doTraining = true;
if doTraining
% Train the agent.
trainingStats = train(agent,env,trainOpts);
else
% Load the pretrained agent for the example.
% load('WaterTankDDPG.mat','agent')
end
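For reference, the helper init_env.m called above is not included in this post. The following is only a hypothetical sketch of what such a helper typically contains, assuming rlNumericSpec specs, rlSimulinkEnv, and the ResetFcn.m mentioned in the comments; the agent block path 'RL Agent' is an assumption, not taken from the actual model:
function [env, obsInfo, actInfo] = init_env(obsmin, obsmax, numActs, mdl)
    % Continuous observation space: one element per observed variable
    obsInfo = rlNumericSpec([numel(obsmin) 1], ...
        'LowerLimit', obsmin(:), 'UpperLimit', obsmax(:));
    obsInfo.Name = 'observations';
    % Continuous action space with numActs outputs
    actInfo = rlNumericSpec([numActs 1]);
    actInfo.Name = 'actions';
    % Wrap the Simulink model; the block path must point at the RL Agent block
    agentBlk = [mdl '/RL Agent'];   % assumed block name
    env = rlSimulinkEnv(mdl, agentBlk, obsInfo, actInfo);
    % Attach the custom reset function (accepts and returns a SimulationInput)
    env.ResetFcn = @(in) ResetFcn(in);
end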
The command line output when running this is as follows:
Warning: Error occurred while executing the listener callback for event EpisodeFinished
defined for class rl.env.SimulinkEnvWithAgent:
Error using rl.policy.AbstractPolicy/getAction (line 258)
Invalid observation type or size.
Error in rl.agent.rlDDPGAgent/evaluateQ0Impl (line 141)
action = getAction(this, observation);
Error in rl.agent.AbstractAgent/evaluateQ0 (line 275)
q0 = evaluateQ0Impl(this,observation);
Error in rl.train.TrainingManager/update (line 134)
q0 = evaluateQ0(this.Agents(idx),epinfo(idx).InitialObservation);
Error in rl.train.TrainingManager>@(info)update(this,info) (line 437)
trainer.FinishedEpisodeFcn = @(info) update(this,info);
Error in rl.train.Trainer/notifyEpisodeFinishedAndCheckStopTrain (line 56)
stopTraining = this.FinishedEpisodeFcn(info);
Error in rl.train.SeriesTrainer>iUpdateEpisodeFinished (line 31)
notifyEpisodeFinishedAndCheckStopTrain(this,ed.Data);
Error in rl.train.SeriesTrainer>@(src,ed)iUpdateEpisodeFinished(this,ed) (line 17)
@(src,ed) iUpdateEpisodeFinished(this,ed));
Error in rl.env.AbstractEnv/notifyEpisodeFinished (line 324)
notify(this,'EpisodeFinished',ed);
Error in rl.env.SimulinkEnvWithAgent/executeSimsWrapper/nestedSimFinishedBC (line 222)
notifyEpisodeFinished(this,...
Error in rl.env.SimulinkEnvWithAgent>@(src,ed)nestedSimFinishedBC(ed) (line 232)
simlist(1) = event.listener(this.SimMgr,'SimulationFinished' ,@(src,ed)
nestedSimFinishedBC(ed));
Error in Simulink.SimulationManager/handleSimulationOutputAvailable
Error in
Simulink.SimulationManager>@(varargin)obj.handleSimulationOutputAvailable(varargin{:})
Error in MultiSim.internal.SimulationRunnerSerial/executeImplSingle
Error in MultiSim.internal.SimulationRunnerSerial/executeImpl
Error in Simulink.SimulationManager/executeSims
Error in Simulink.SimulationManagerEngine/executeSims
Error in rl.env.SimulinkEnvWithAgent/executeSimsWrapper (line 244)
executeSims(this.SimEngine,simfh,in);
Error in rl.env.SimulinkEnvWithAgent/simWrapper (line 267)
simouts = executeSimsWrapper(this,in,simfh,simouts,opts);
Error in rl.env.SimulinkEnvWithAgent/simWithPolicyImpl (line 424)
simouts = simWrapper(env,policy,simData,in,opts);
Error in rl.env.AbstractEnv/simWithPolicy (line 82)
[experiences,varargout{1:(nargout-1)}] =
simWithPolicyImpl(this,policy,opts,varargin{:});
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 166)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 170)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 194)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 421)
run(trainer);
Error in rl.train.TrainingManager/run (line 211)
train(this);
Error in rl.agent.AbstractAgent/train (line 78)
TrainingStatistics = run(trainMgr);
Error in main_RL (line 277)
trainingStats = train(agent,env,trainOpts);
Caused by:
Error using rl.representation.rlAbstractRepresentation/validateInputData (line 525)
Input data dimensions must match the dimensions specified in the corresponding
observation and action info specifications.
> In rl.env/AbstractEnv/notifyEpisodeFinished (line 324)
In rl.env.SimulinkEnvWithAgent.executeSimsWrapper/nestedSimFinishedBC (line 222)
In rl.env.SimulinkEnvWithAgent>@(src,ed)nestedSimFinishedBC(ed) (line 232)
In Simulink/SimulationManager/handleSimulationOutputAvailable
In Simulink.SimulationManager>@(varargin)obj.handleSimulationOutputAvailable(varargin{:})
In MultiSim.internal/SimulationRunnerSerial/executeImplSingle
In MultiSim.internal/SimulationRunnerSerial/executeImpl
In Simulink/SimulationManager/executeSims
In Simulink/SimulationManagerEngine/executeSims
In rl.env/SimulinkEnvWithAgent/executeSimsWrapper (line 244)
In rl.env/SimulinkEnvWithAgent/simWrapper (line 267)
In rl.env/SimulinkEnvWithAgent/simWithPolicyImpl (line 424)
In rl.env/AbstractEnv/simWithPolicy (line 82)
In rl.task/SeriesTrainTask/runImpl (line 33)
In rl.task/Task/run (line 21)
In rl.task/TaskSpec/internal_run (line 166)
In rl.task/TaskSpec/runDirect (line 170)
In rl.task/TaskSpec/runScalarTask (line 194)
In rl.task/TaskSpec/run (line 69)
In rl.train/SeriesTrainer/run (line 24)
In rl.train/TrainingManager/train (line 421)
In rl.train/TrainingManager/run (line 211)
In rl.agent.AbstractAgent/train (line 78)
In main_RL (line 277)
Error using rl.env.AbstractEnv/simWithPolicy (line 82)
An error occurred while simulating "f16rl" with the agent "agent".
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 166)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 170)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 194)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 421)
run(trainer);
Error in rl.train.TrainingManager/run (line 211)
train(this);
Error in rl.agent.AbstractAgent/train (line 78)
TrainingStatistics = run(trainMgr);
Error in main_RL (line 277)
trainingStats = train(agent,env,trainOpts);
Caused by:
Error using rl.env.SimulinkEnvWithAgent>localHandleSimoutErrors (line 681)
No system or file called 'f16rlf16rl' found.
A similar problem is referenced below, but that solution is not what I need, as I am fairly sure this is a continuous DDPG agent and so it should work:
https://www.mathworks.com/matlabcentral/answers/663018-error-using-rl-env-simulinkenvwithagent-localhandlesimouterrors-line-689-by-rl-toolbox
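As a rough, hypothetical way to narrow this down (not part of the original script), the checks below compare the environment specs against what the model should be feeding the RL Agent block, and confirm that the reset function does not end up doubling the model name, which is what the final 'f16rlf16rl' error suggests:
% Hypothetical diagnostic checks -- not from the original post.
% 1) The observation spec should be [19 1] and the action spec [4 1]; the
%    vector wired into the RL Agent block's observation port must have
%    exactly the same size and ordering.
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
disp(obsInfo.Dimension)   % expect [19 1]
disp(actInfo.Dimension)   % expect [4 1]
% 2) The custom reset function must return a SimulationInput for 'f16rl'
%    itself; if the model name is prepended twice anywhere, the simulation
%    manager looks for 'f16rlf16rl' and fails as shown above.
in = Simulink.SimulationInput(mdl);   % mdl = 'f16rl'
in = ResetFcn(in);                    % ResetFcn.m is the custom reset used here
disp(in.ModelName)                    % expect 'f16rl'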