Why does my agent not output the optimal solution during validation?
Hello everyone,
Topic: Reinforcement Learning, DQN Agent.
I trained an agent on my dataset (28 training samples in total) and then validated it on the same data. The problem is that I do not get optimal results in validation: some results were good, but not all of them.
- env: I created a custom environment.
- I create the critic with this function: critic = rlVectorQValueFunction(nn,obsInfo,actInfo);
- With the critic I create a DQN agent: agent = rlDQNAgent(critic);
I also tried a new agent with only 1 data sample. Training converged, and validation gave the right answer for that sample. But when I trained an agent on all 28 samples using the same hyperparameters, correctness was not guaranteed. I don't know the reason. Is the dataset too small, or did I choose the wrong hyperparameters?
Agent hyperparameters:
agent.AgentOptions.EpsilonGreedyExploration.EpsilonDecay = 0.9;
agent.AgentOptions.EpsilonGreedyExploration.Epsilon = 0.9;
agent.AgentOptions.EpsilonGreedyExploration.EpsilonMin = 0.001;
agent.AgentOptions.DiscountFactor = 0.99;
agent.AgentOptions.MiniBatchSize = 128;
agent.AgentOptions.CriticOptimizerOptions.LearnRate = 0.0008;
agent.AgentOptions.CriticOptimizerOptions.GradientThreshold = 1;
agent.AgentOptions.SaveExperienceBufferWithAgent = true;
Thank you
Kun
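One thing worth checking in the hyperparameters above is the exploration schedule. As far as I understand the rlDQNAgentOptions documentation, Epsilon is updated at the end of every training time step as Epsilon = Epsilon*(1-EpsilonDecay) while it is above EpsilonMin. With EpsilonDecay = 0.9 that would remove almost all exploration within a few steps. The sketch below only reproduces that arithmetic in plain MATLAB (no toolbox calls). The variable names and the target of 5000 exploration steps are my own assumptions, not values from the post.

```matlab
% Values copied from the post.
epsilon0     = 0.9;
epsilonDecay = 0.9;
epsilonMin   = 0.001;

% Update rule as I understand it from the documentation:
%   Epsilon = Epsilon*(1 - EpsilonDecay), applied once per training step
%   while Epsilon is still above EpsilonMin.
epsilon = epsilon0;
nSteps  = 0;
while epsilon > epsilonMin
    epsilon = epsilon*(1 - epsilonDecay);
    nSteps  = nSteps + 1;
    fprintf('step %d: epsilon = %.4g\n', nSteps, epsilon);
end
fprintf('Epsilon reaches the minimum after %d training steps.\n', nSteps);

% Decay rate that would spread exploration over a chosen number of steps.
% targetSteps is an assumed value; pick it from your own training length.
targetSteps    = 5000;
suggestedDecay = 1 - (epsilonMin/epsilon0)^(1/targetSteps);
fprintf('EpsilonDecay for ~%d steps of exploration: %.3g\n', ...
    targetSteps, suggestedDecay);
```

If the rule is as assumed, the agent acts almost greedily from the start of training. That can still be enough for a single sample, but it may not cover 28 different ones.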
2 Comments
Emmanouil Tzorakoleftherakis
on 13 Jun 2023
Are you using an IsDone signal? What do you mean by 28 training data? Do you mean 28 episodes? If that's the case, this number is really small. You need to at least give it a few hundred episodes to get an idea of how training progresses.
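To make the point about episode count concrete, here is a minimal sketch of training options that let the agent run for a few hundred episodes. It assumes env and agent already exist as in the question. All numeric values are placeholders I chose for illustration, not recommendations from the thread.

```matlab
% Assumes env and agent were created as described in the question.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes', 500, ...                 % assumed: "a few hundred" episodes
    'MaxStepsPerEpisode', 200, ...          % assumed cap in case IsDone never becomes true
    'ScoreAveragingWindowLength', 20, ...   % assumed window for the average reward
    'StopTrainingCriteria', 'EpisodeCount', ...
    'StopTrainingValue', 500, ...           % stop only after all episodes have run
    'Plots', 'training-progress');

trainingStats = train(agent, env, trainOpts);

% Inspect how the averaged reward evolves over the episodes.
plot(trainingStats.EpisodeIndex, trainingStats.AverageReward);
xlabel('Episode');
ylabel('Average reward');
```

If the 28 samples are presented one per episode, each sample would be seen only a handful of times in a short run, so the reward curve over several hundred episodes is a better indicator of convergence than a single pass.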

