<DRL_PPO> error: Number of input layers for deep neural network must equal to number of observation specifications.

Question

祥 on 20 Mar 2024

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/2096546-drl_ppo-erro-number-of-input-layers-for-deep-neural-network-must-equal-to-number-of-observation-spe

Answered: Shivansh on 5 Apr 2024

Hi guys

I have recently been using Reinforcement Learning Toolbox, and I am training an agent with the PPO algorithm in a custom environment.

I then ran into an error. My code and the debug log are below. I sincerely hope someone can help; I would be very grateful.

The debug logs are as follows:

error using rl.internal.validate.mapFunctionObservationInput

Number of input layers for deep neural network must equal to number of observation specifications.

error in rlValueFunction (line 92)

modelInputMap = rl.internal.validate.mapFunctionObservationInput(model,observationInfo,nameValueArgs.ObservationInputNames);

error in ppo0320 (line 53)

critic= rlValueFunction(criticdlnet1,obsInfo);

My code is as follows:

slx = 'RLcontrolstrategy0312';   
open_system(slx);
agentblk = slx +"/agent";
%obsInfo actInfo(The problem might be here, right?)
numObs=49;
obsInfo=rlNumericSpec([49,1],'LowerLimit',0, 'UpperLimit',1);
numAct=6;
actInfo = rlNumericSpec([6,1], 'LowerLimit',[0 0 0 -1 -1 -1]','UpperLimit',[1 1 1 1 1 1]'); 
scale = [0.5 0.5 0.5 1 1 1]';
bias = [0.5 0.5 0.5 0 0 0]';
env = rlSimulinkEnv(slx,agentblk,obsInfo,actInfo);
Ts = 0.001;
Tf = 4;
rng(0)
%critic
cnet = [
    featureInputLayer(9,"Normalization","none","Name","name1")
    fullyConnectedLayer(256,"Name","fc1")
    concatenationLayer(1,2,"Name","concat")
    tanhLayer("Name","tanh1")
    fullyConnectedLayer(256,"Name","fc2")
    tanhLayer("Name","tanh2")
    fullyConnectedLayer(128,"Name","fc3")
    tanhLayer("Name","tanh3")
    fullyConnectedLayer(64,"Name","fc4")
    tanhLayer("Name","tanh4")
    fullyConnectedLayer(64,"Name","fc5")
    tanhLayer("Name","tanh5")
    fullyConnectedLayer(1,"Name","CriticOutput")];
cnetMC=[
    featureInputLayer(40,"Normalization","none","Name","name2")
    fullyConnectedLayer(512,"Name","fc11")
    tanhLayer("Name","tanh13")
    fullyConnectedLayer(128,"Name","fc14")
    tanhLayer("Name","tanh14")
    fullyConnectedLayer(64,"Name","fc15")];
criticNetwork = layerGraph(cnet);
criticNetwork = addLayers(criticNetwork, cnetMC);
criticNetwork = connectLayers(criticNetwork,"fc15","concat/in2");
criticdlnet = dlnetwork(criticNetwork,'Initialize',false);
criticdlnet1 = initialize(criticdlnet);
%(The problem might be here, right?)
critic= rlValueFunction(criticdlnet1,obsInfo);
%actor
anet = [
    featureInputLayer(9,"Normalization","none","Name","name1")
    fullyConnectedLayer(256,"Name","fc1")
    concatenationLayer(1,2,"Name","concat")
    tanhLayer("Name","tanh1")
    fullyConnectedLayer(256,"Name","fc2")
    tanhLayer("Name","tanh2")
    fullyConnectedLayer(128,"Name","fc3")
    tanhLayer("Name","tanh3")
    fullyConnectedLayer(64,"Name","fc4")
    tanhLayer("Name","tanh4")];
anetMC=[
    featureInputLayer(40,"Normalization","none","Name","name2")
    fullyConnectedLayer(512,"Name","fc11")
    tanhLayer("Name","tanh13")
    fullyConnectedLayer(128,"Name","fc14")
    tanhLayer("Name","tanh14")
    fullyConnectedLayer(64,"Name","fc15")];
meanPath = [
    fullyConnectedLayer(64,"Name","meanFC")
    tanhLayer("Name","tanh5")
    fullyConnectedLayer(numAct,"Name","mean")
    tanhLayer("Name","tanh6")
    scalingLayer(Name="meanPathOut",Scale=scale,Bias=bias)];
stdPath = [
    fullyConnectedLayer(64,"Name","stdFC")
    tanhLayer("Name","tanh7")
    fullyConnectedLayer(numAct,"Name","fc5")
    softplusLayer("Name","std")];
actorNetwork = layerGraph(anet);
actorNetwork = addLayers(actorNetwork,anetMC);
actorNetwork = connectLayers(actorNetwork,"fc15","concat/in2");
actorNetwork = addLayers(actorNetwork,meanPath);
actorNetwork = addLayers(actorNetwork,stdPath);
actorNetwork = connectLayers(actorNetwork,"tanh4","meanFC/in");
actorNetwork = connectLayers(actorNetwork,"tanh4","stdFC/in");
actordlnet = dlnetwork(actorNetwork);
%(The problem might be here, right?)
actor = rlContinuousGaussianActor(actordlnet,obsInfo,actInfo, ...
    "ActionMeanOutputNames","meanPathOut", ...
    "ActionStandardDeviationOutputNames","std");
% Agent hyperparameters
agentOptions=rlPPOAgentOptions("SampleTime",Ts,"DiscountFactor",0.995,"ExperienceHorizon",1024,"MiniBatchSize",512,"ClipFactor",0.2, ...
                               "EntropyLossWeight",0.01,"NumEpoch",8,"AdvantageEstimateMethod","gae","GAEFactor",0.98, ...
                               "NormalizedAdvantageMethod","current");
agentOptions.ActorOptimizerOptions=rlOptimizerOptions("LearnRate",0.0001,"GradientThreshold",1, ...
                                                      "L2RegularizationFactor",0.0004,"Algorithm","adam");
agentOptions.CriticOptimizerOptions=rlOptimizerOptions("LearnRate",0.0001,"GradientThreshold",1, ...
                                                      "L2RegularizationFactor",0.0004,"Algorithm","adam");
%Creating an Agent
agent=rlPPOAgent(actor,critic,agentOptions);
%training
trainOptions=rlTrainingOptions("StopOnError","on", "MaxEpisodes",2000,"MaxStepsPerEpisode",floor(Tf/Ts), ...
                            "ScoreAveragingWindowLength",10,"StopTrainingCriteria","AverageReward", ...
                            "StopTrainingValue",100000,"SaveAgentCriteria","None", ...
                            "SaveAgentDirectory","D:\car\jianmo\zhangxiang\agent","Verbose",false, ...
                            "Plots","training-progress");
doTraining = true; 
if doTraining    
    trainingStats = train(agent,env,trainOptions);
else
    load('agent.mat','agent')
end

Answer 1

Shivansh on 5 Apr 2024

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/2096546-drl_ppo-erro-number-of-input-layers-for-deep-neural-network-must-equal-to-number-of-observation-spe#answer_1437061

Hi,

The error message you're encountering, "Number of input layers for deep neural network must equal to number of observation specifications," means that the number of input layers in your neural networks (actor and critic) does not match the number of observation specifications (observation channels) defined for your environment.
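The mismatch can be checked directly by counting both sides. The snippet below is a minimal sketch, not your full model: it rebuilds a shortened two-input network with the same input layer names and sizes as your critic (the hidden layer size of 16 is an arbitrary choice for brevity) and compares the number of network inputs with the number of observation specifications.

```matlab
% Observation spec as in the question: a single 49-by-1 channel
obsInfo = rlNumericSpec([49 1],'LowerLimit',0,'UpperLimit',1);

% Shortened two-input network (hidden sizes are illustrative only)
pathA = [
    featureInputLayer(9,"Normalization","none","Name","name1")
    fullyConnectedLayer(16,"Name","fc1")
    concatenationLayer(1,2,"Name","concat")
    fullyConnectedLayer(1,"Name","CriticOutput")];
pathB = [
    featureInputLayer(40,"Normalization","none","Name","name2")
    fullyConnectedLayer(16,"Name","fc15")];

lg = layerGraph(pathA);
lg = addLayers(lg,pathB);
lg = connectLayers(lg,"fc15","concat/in2");
net = dlnetwork(lg);

% Compare the two counts that rlValueFunction validates
numInputs = numel(net.InputNames);   % input layers in the network
numSpecs  = numel(obsInfo);          % observation channels in the spec
fprintf("Network inputs: %d, observation specs: %d\n",numInputs,numSpecs);
```

If the two numbers differ, rlValueFunction and rlContinuousGaussianActor cannot map observation channels to input layers, which is the situation the error describes.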

You have defined the observation space (obsInfo) as a single specification with dimension [49,1], so the environment provides each observation as one 49-element vector. However, both the critic and the actor networks have two input layers: featureInputLayer(9) named "name1" and featureInputLayer(40) named "name2". The sizes add up to 49, but the software sees two network inputs and only one observation channel, so it cannot map one to the other.

You need to make sure that the input layers of your networks match the observation specifications, both in number and in size. Since your observation space is a single 49-element channel, each network should start with a single input layer of size 49 if you process the observations as a whole. If you want to keep the two separate input paths, the environment would instead have to provide two observation channels (of sizes 9 and 40), one per input layer.
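As an illustration of the single-input option, here is a minimal sketch of a critic whose only input layer matches obsInfo. It assumes a release that provides rlValueFunction (as your code already uses); the input layer name "obs" and the reduced set of hidden layers are my own choices, not taken from your model.

```matlab
% Observation spec as in the question: one 49-by-1 channel
obsInfo = rlNumericSpec([49 1],'LowerLimit',0,'UpperLimit',1);
numObs  = obsInfo.Dimension(1);

% One input layer whose size equals the observation dimension
cnet = [
    featureInputLayer(numObs,"Normalization","none","Name","obs")
    fullyConnectedLayer(256,"Name","fc1")
    tanhLayer("Name","tanh1")
    fullyConnectedLayer(128,"Name","fc2")
    tanhLayer("Name","tanh2")
    fullyConnectedLayer(64,"Name","fc3")
    tanhLayer("Name","tanh3")
    fullyConnectedLayer(1,"Name","CriticOutput")];

criticdlnet = dlnetwork(layerGraph(cnet));

% One network input, one observation spec: the counts now agree
critic = rlValueFunction(criticdlnet,obsInfo);

% Quick sanity check with a random observation inside the [0,1] limits
v = getValue(critic,{rand(numObs,1)});
```

The actor needs the same change: replace its two featureInputLayer objects with a single input of size 49 before passing the network to rlContinuousGaussianActor.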

You can refer to the following documentation for more information on Reinforcement Learning Toolbox in MATLAB:

https://www.mathworks.com/help/reinforcement-learning/index.html

I hope it helps!