Parallel training of rlDQNAgents with parfor fails for large numbers of agents

Dear Community,
I have a problem with the parallel training of RL agents.
Description:
I initialize, for example, a 1x100 array of rlDQNAgent objects, agenttrain, each with different parameter settings. All agents are trained with the same training options in the same environment. A condensed version of the parallel training looks like this:
agentoutput = agenttrain;
parfor i = 1:100
out(i) = train(agenttrain(i),env,trainingOptions);
agentoutput(i) = agenttrain(i);
end
I assign to agentoutput inside the parfor loop to capture the changes to each rlDQNAgent's network. Running this on, for example, 60 parallel workers causes no problems. If I increase the number of agents (from 100 to 1000), I get the following error message:
During array expansion:
No default is defined for class 'rl.agent.rlDQNAgent'.
Method 'getDefaultScalarElement' in superclass rl.policy.AbstractPolicy is missing or
incorrectly defined.
Do you have any idea why this error only occurs when the number of agents is higher?

Accepted Answer

Florian Rosner on 6 Aug 2021
Following a support request, I was able to work around this issue.
Collecting the results in cell arrays makes the parfor loop work:
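numag = numel(agenttrain);   % number of agents (assumed; not defined in the original snippet)
c1 = cell(1,numag);          % training results, one cell per agent
c2 = cell(1,numag);          % trained agents, one cell per agent
parfor i = 1:numag
    c1{i} = train(agenttrain(i),env,trainingOptions);
    c2{i} = agenttrain(i);
end
out = [c1{:}];               % concatenate into arrays after the loop
agentoutput = [c2{:}];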
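The same collection pattern can be tried without the Reinforcement Learning Toolbox. The following is only a minimal sketch with stand-in data: trainStub, params, and the value of numag are hypothetical placeholders, not part of the original code, and the struct returned by trainStub merely imitates a training result. The point it illustrates is that each iteration writes into its own cell, and the concatenation into an array happens once, after the loop, so no object array has to grow inside the parfor.

```matlab
% Sketch of the cell-array collection pattern using stand-in data.
% trainStub is a hypothetical placeholder for train(agent,env,opts).
numag  = 8;                                 % hypothetical number of agents
params = linspace(0.1, 0.8, numag);         % stand-in for per-agent settings
trainStub = @(p) struct('param', p, 'score', 2*p);

c1 = cell(1, numag);                        % one cell per iteration
parfor i = 1:numag
    % feval keeps the function-handle call unambiguous inside parfor
    c1{i} = feval(trainStub, params(i));
end

out = [c1{:}];                              % 1-by-numag struct array
disp(numel(out));
```

If no parallel pool is available, parfor runs the iterations serially, so the sketch can still be used to check the bookkeeping.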

Version

R2021a
