Training agent in reinforcement learning: reproducibility of the code

Question

Duc Nguyen on 19 Feb. 2024

Commented: Ari Biswas on 17 Jun. 2024

I get two different results when running this water-tank system example for reinforcement learning from MathWorks:

https://uk.mathworks.com/help/reinforcement-learning/ug/create-simulink-environment-and-train-agent.html

The example fixes the random number generator seed with rng(0), so I expected the result to be the same on all computers. However, I ended up with two different agents on two computers:

Computer A finished training the agent after 86 episodes (just like the published example) and gave me an identical agent to the example.
Computer B needed 182 episodes to train the agent and gave me a different agent.

Both computers run MATLAB R2023b 64-bit on MS Windows 10. The code is unchanged from the example (except for changing doTraining = false to doTraining = true).

Computer A has an 8-core i7 processor. Computer B has a 6-core i7 processor.

I'm writing a tutorial for a university-level course, so reproducibility is necessary for students to be able to follow the example. Any tips on how to achieve this would be much appreciated.

Answer 1

Ari Biswas on 20 Feb. 2024

This could be the result of slight variations in floating-point arithmetic across different computer architectures. These variations can accumulate over the course of training and produce significant differences, which is why we cannot always guarantee that the results in our examples are reproducible. Ideally, the training should be reproduced in the same environment. If you cannot ensure that everyone has the same system configuration, it would be good to vary the random seed or the agent hyperparameters slightly to get better training performance.
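
As a rough illustration of the floating-point point above, here is a minimal sketch in Python (not MATLAB, and not taken from the water-tank example). Floating-point addition is not associative, so the same numbers added in a different order can round differently. Different processors, core counts, or math libraries may order the operations of one computation differently. The helper name `sum_in_order` is hypothetical and used only for illustration.

```python
def sum_in_order(values):
    """Add the values one by one, left to right, in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]
forward = sum_in_order(values)
backward = sum_in_order(list(reversed(values)))

# Same numbers, different order of additions, different rounding.
print(forward == backward)  # False
print(forward, backward)

# A more extreme case: the small term is lost when it meets the large one first.
print(sum_in_order([1e16, 1.0, -1e16]))  # 0.0
print(sum_in_order([1e16, -1e16, 1.0]))  # 1.0
```

In a training loop, rounding differences like these feed into every later update, so two runs with the same seed can drift apart even though the random draws are identical.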

2 Comments

轩 on 16 Jun. 2024

Does this phenomenon occur in other frameworks, such as PyTorch?

Also, in some MathWorks RL examples, reproducibility is perfect between two different computers, which confuses me. When should I consider computer hardware as a factor?

Please help, and thank you in advance!

Ari Biswas on 17 Jun. 2024

I believe it would be difficult to guarantee reproducibility across platforms or hardware, irrespective of the framework (MATLAB or PyTorch). Please check the PyTorch documentation for details.

Differences in hardware could affect reproducibility. It all comes down to how the architectures compute floating-point numbers. If your example results are reproducible, you don't need to worry too much. If they are not, differences in hardware could be one of the contributing factors.

When you are sharing your work with others it can be good practice to cite the hardware configuration. That way they would know the system the work was originally performed on.
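
One way to record the system a result was produced on is sketched below in Python, using only the standard library. This is an assumption about what such a record might contain, not a MathWorks recommendation. The function name `describe_system` and the chosen fields are hypothetical.

```python
import os
import platform


def describe_system():
    """Collect basic details about the machine running the code."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "processor": platform.processor(),  # may be an empty string on some systems
        "cpu_count": os.cpu_count(),  # logical cores; None if it cannot be determined
        "python_version": platform.python_version(),
    }


# Print one "key: value" line per field so it can be pasted next to the results.
for key, value in describe_system().items():
    print(f"{key}: {value}")
```

Saving this output alongside the trained agent gives readers a reference point when their own run differs.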
