How to know if an RL agent has been updated

Haochen on 16 May 2024
Commented: Tejas on 23 May 2024
Hi all,
I am training an RL agent and would like to verify that training actually updates it. How can I check whether the agent has been updated?
For example, in the official example of 'rl/TrainMultipleAgentsForAreaCoverageExample', I extracted the code related to the agent definition, the training and the simulation:
%...
agentA = rlPPOAgent(actor(1),critic(1),opt);
agentB = rlPPOAgent(actor(2),critic(2),opt);
agentC = rlPPOAgent(actor(3),critic(3),opt);
%...
if doTraining
result = train([agentA,agentB,agentC],env,trainOpts);
else
load("rlAreaCoverageAgents.mat");
end
%...
rng(0) % reset the random seed
simOpts = rlSimulationOptions(MaxSteps=maxsteps);
experience = sim(env,[agentA,agentB,agentC],simOpts);
However, suppose that after training I want to check whether agentA has changed:
copy = agentA;
%the above code section where agentA is trained...
disp(copy==agentA)
The displayed result is 1. Does that mean agentA has not changed?
This is from the official example, so I believe the agents should indeed have been trained. The simulation result also suggests that they have been trained, since an agent takes significantly longer to complete the task before train() than after train().
It seems that train() does update the agents, but how can I tell explicitly from the variables in my workspace that they have been updated? And why is the above comparison not working? Thank you.
Haochen Tao

Accepted Answer

Tejas on 23 May 2024
Hi Haochen,
I ran into the same issue in MATLAB R2024a. My understanding is that agents are handle objects, so assigning one with the '=' operator creates a second reference to the same agent rather than an independent copy. Both variables, copy and agentA, therefore refer to the same object. Because train() updates that object in place, both variables still refer to the same (now trained) agent after training, which is why disp(copy == agentA) yields 1.
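The reference behaviour described above can be reproduced without an agent. The following is a minimal sketch that uses containers.Map, a built-in handle class, purely as a stand-in for an agent; the key name 'weight' and the variable names are illustrative assumptions and are not part of the original example.

```matlab
% containers.Map is a built-in handle class, used here only as a stand-in for an agent.
m = containers.Map();
m('weight') = 0;                % initial "parameter"

alias = m;                      % copies the handle, not the underlying data
m('weight') = 1;                % in-place update, analogous to what train() does to an agent

sameObject  = (alias == m);     % == on handles tests whether both refer to the same object
aliasWeight = alias('weight');  % the alias sees the update made through m

disp(sameObject)                % displays 1
disp(aliasWeight)               % displays 1
```

Because alias and m are two names for one object, the comparison cannot reveal whether the object's contents changed.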
To assess the differences in the agent before and after training, try this workaround:
  • Right after creating the copy of the agent, save it into a .MAT file.
copy = agentA;
save('copy.mat','copy');
  • Before proceeding with the comparison, load the copied agent from the .MAT file.
load('copy.mat');
disp(copy == agentA);
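One caveat on the comparison above: for handle objects, == tests whether two variables refer to the same object, so a copy loaded from a .MAT file will most likely compare as 0 whether or not training changed anything. A more direct check is to compare the learnable parameters before and after training. The sketch below assumes Reinforcement Learning Toolbox is installed and that agentA, agentB, agentC, env and trainOpts exist as in the example; the names paramsBefore, paramsAfter and actorChanged are illustrative, and it assumes isequal can compare the parameter arrays returned by getLearnableParameters.

```matlab
% Assumes agentA, agentB, agentC, env and trainOpts are defined as in the example.

% Snapshot the actor parameters of agentA before training.
% getLearnableParameters returns a cell array of parameter arrays (values, not handles).
paramsBefore = getLearnableParameters(getActor(agentA));

result = train([agentA,agentB,agentC],env,trainOpts);

% Read the actor parameters again after training.
paramsAfter = getLearnableParameters(getActor(agentA));

% True if at least one actor parameter differs from its pre-training value.
actorChanged = ~isequal(paramsBefore,paramsAfter);
disp(actorChanged)
```

The same pattern should apply to the critic by using getCritic in place of getActor.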
For more information on operations with .MAT files, please refer to the MATLAB documentation.
2 Comments
Haochen on 23 May 2024
Thank you,
After reading the documentation, my understanding is that save() duplicates the content elsewhere. After train() has run and both 'copy' and 'agentA' have changed in the same way, load() assigns the duplicated content back to 'copy'. Is that right?
Tejas on 23 May 2024
Yes, your understanding is correct.

More Answers (0)