getValue function produces "Not enough input arguments" (I think)
Cengiz Aksakal
on 19 Jul 2020
Commented: Cengiz Aksakal on 23 Jul 2020
I am trying to create a minimax-Q agent by modifying the rlQAgent implementation that ships with MATLAB. The learn function from rlQAgent is copied directly below:
function action = learn(this,exp)
    % learn from current experiences, return action with exploration
    % exp = {state,action,reward,nextstate,isdone}
    Observation = exp{1};
    Action = exp{2};
    Reward = exp{3};
    NextObservation = exp{4};
    Done = exp{5};
    if Done == 1
        % for the final step, just use the immediate reward, since
        % there is no next state
        QTargetEstimate = Reward;
    else
        TargetQValue = getMaxQValue(this.Critic, NextObservation);
        % x = getValue(this.Critic, NextObservation);
        QTargetEstimate = Reward + this.AgentOptions.DiscountFactor * TargetQValue;
    end
    % update the critic with the computed target
    CriticGradient = gradient(this.Critic,'loss-parameters',...
        [Observation,Action], QTargetEstimate);
    this.Critic = optimize(this.Critic, CriticGradient);
    % update exploration model
    this.ExplorationModel = update(this.ExplorationModel);
    % compute action from the current policy
    action = getActionWithExploration(this, NextObservation);
end
I am trying to change the line
TargetQValue = getMaxQValue(this.Critic, NextObservation);
into
x = getValue(this.Critic, NextObservation);
As far as I know, getValue is supposed to return a matrix of Q-values. To prevent further errors, I put a breakpoint on the following line:
QTargetEstimate = Reward + this.AgentOptions.DiscountFactor * TargetQValue;
However, the program terminates prematurely with a "Not enough input arguments." error before "x" is ever assigned. Can anyone tell me what the problem is? Thanks.
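For reference, a rough sketch of the two call patterns involved (the names come from the Reinforcement Learning Toolbox, but the commentary on what each call needs is my assumption, not the toolbox source):

```matlab
% getMaxQValue enumerates the actions internally, so only the
% observation is needed:
[maxQ, maxIdx] = getMaxQValue(this.Critic, NextObservation);

% For a critic whose representation takes both an observation AND an
% action as inputs, getValue needs the action argument too; calling it
% with only the observation leaves one input unfilled, which would
% plausibly surface as "Not enough input arguments.":
% x = getValue(this.Critic, NextObservation);               % action missing
% x = getValue(this.Critic, NextObservation, {someAction}); % someAction is a placeholder
```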
Edit:
Error using rl.agent.AbstractPolicy/step (line 116)
Invalid input argument type or size such as observation, reward, isdone or loggedSignals.
Error in rl.env.MATLABEnvironment/simLoop (line 241)
action = step(policy,observation,reward,isdone);
Error in rl.env.MATLABEnvironment/simWithPolicyImpl (line 106)
[expcell{simCount},epinfo,siminfos{simCount}] = simLoop(env,policy,opts,simCount,usePCT);
Error in rl.env.AbstractEnv/simWithPolicy (line 70)
[experiences,varargout{1:(nargout-1)}] = simWithPolicyImpl(this,policy,opts,varargin{:});
Error in rl.task.SeriesTrainTask/runImpl (line 33)
[varargout{1},varargout{2}] = simWithPolicy(this.Env,this.Agent,simOpts);
Error in rl.task.Task/run (line 21)
[varargout{1:nargout}] = runImpl(this);
Error in rl.task.TaskSpec/internal_run (line 159)
[varargout{1:nargout}] = run(task);
Error in rl.task.TaskSpec/runDirect (line 163)
[this.Outputs{1:getNumOutputs(this)}] = internal_run(this);
Error in rl.task.TaskSpec/runScalarTask (line 187)
runDirect(this);
Error in rl.task.TaskSpec/run (line 69)
runScalarTask(task);
Error in rl.train.SeriesTrainer/run (line 24)
run(seriestaskspec);
Error in rl.train.TrainingManager/train (line 291)
run(trainer);
Error in rl.train.TrainingManager/run (line 160)
train(this);
Error in rl.agent.AbstractAgent/train (line 54)
TrainingStatistics = run(trainMgr);
Error in gridWorld (line 34)
trainingStats = train(qAgent,env,trainOpts);
Caused by:
Not enough input arguments.
This is the whole error log. Note that in rlQAgent.m I only changed the line proposed above; everything else is exactly the same.
6 Comments
Walter Roberson
on 20 Jul 2020
What code sequence and data is needed to trigger the error?
(Though I do not think I have the necessary toolboxes to test with.)
Accepted Answer
Madhav Thakker
on 23 Jul 2020
I understand that getValue(critic, obs) throws "Not enough input arguments", while getMaxQValue(critic, obs) works fine. I assume that you want the values associated with the complete action set for a particular observation.
In that case, value = getValue(qValueRep, obs, act) can be used. I am also attaching sample code for reference. I have tested it and it works as expected.
state = 2;
c = getCritic(qAgent);
for k = 1:length(c.ActionInfo.Elements)
    action = c.ActionInfo.Elements(k);
    getValue(c, state, action)
end
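Since a minimax-Q agent typically needs the whole vector of Q-values rather than printing them one by one, the loop above can be extended to collect the values into a vector; this is a sketch under the same assumptions as the answer's code (discrete action space exposed via c.ActionInfo.Elements, the same qAgent variable):

```matlab
% Sketch: gather the Q-values for every action at one state into a
% vector, then take the extremum. Assumes a discrete action space and
% the critic `c` from the loop above.
state = 2;
c = getCritic(qAgent);
numActions = numel(c.ActionInfo.Elements);
qValues = zeros(numActions, 1);
for k = 1:numActions
    qValues(k) = getValue(c, state, c.ActionInfo.Elements(k));
end
% max over your own actions reproduces getMaxQValue; a minimax-Q agent
% would instead reshape qValues over the joint action space and take a
% max-min over the two players' action dimensions
[bestQ, bestIdx] = max(qValues);
```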
More Answers (0)