How to export and use only the end product of a reinforcement learning algorithm?

Question

Michael Urbanski on 5 Aug 2021

Direct link to this question

https://de.mathworks.com/matlabcentral/answers/893022-how-to-export-and-use-only-the-end-product-of-a-reinforcement-learning-algorithm

Commented: Emmanouil Tzorakoleftherakis on 15 Nov 2023

Hello

I have used reinforcement learning to train a TD3 agent. Now I want to deploy this agent as a controller in a Simulink model, and then possibly on an embedded platform. From what I understand about reinforcement learning, the actor network is the actual end product that computes the control action. Therefore, I don't want to export everything else with it as an RL agent representation, just the neural net. Is there something I should be wary of when doing this? Also, what Simulink block can I use for a deep NN controller? I am not sure if the predict block is suitable here, as the task is not classification and the output should be an action rather than likelihood percentages.

Also, I have created a TD3 agent with LSTM layers in MATLAB R2021a. When I try to import the agent into R2020b, which it is incompatible with, the agent surprisingly does get imported, but when I simulate it to validate the results, I get very different ones from those in R2021a. Are the LSTM layers inside the agent not working properly in R2020b, or are they completely incompatible (for simulating the agent only, not training)? Would doing what I described above (importing only the neural net as a controller) allow me to use the network as a controller on older versions of MATLAB?

Thanks for the help.

Answer 1

Arkadiy Turevskiy on 6 Aug 2021

Direct link to this answer

https://de.mathworks.com/matlabcentral/answers/893022-how-to-export-and-use-only-the-end-product-of-a-reinforcement-learning-algorithm#answer_762687

Hi there,

To deploy a trained RL agent, you need to:

1) Extract the trained policy from the RL agent using generatePolicyFunction. As the doc explains, this function creates a function file, evaluatePolicy.m, and a data file, agentData.mat.
2) To run inference on the trained policy in Simulink, use a MATLAB Function block and call evaluatePolicy from inside it.
3) You are done! You can now simulate your trained policy in Simulink. Starting with R2021a, we support ANSI C code generation for deep learning networks, so you can generate ANSI C code from a Simulink model that contains whatever algorithms you have plus the trained RL policy, represented by the MATLAB Function block.

The generated code should be compatible with any microcontroller, or with a rapid prototyping setup such as a Speedgoat machine.
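
The steps above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the exact workflow from the documentation: the 4-by-1 observation and 1-by-1 action sizes are placeholders, and the untrained default agent created here is a stand-in so that the script runs on its own. In practice you would use your own trained agent instead. The wrapper name policyController is hypothetical; generatePolicyFunction, evaluatePolicy.m, and agentData.mat are the names mentioned above.

```matlab
% Stand-in agent so the sketch is self-contained (assumption: replace this
% with your own trained TD3 agent, e.g. one loaded from a MAT-file).
obsInfo = rlNumericSpec([4 1]);        % placeholder observation size
actInfo = rlNumericSpec([1 1]);        % placeholder action size
agent   = rlTD3Agent(obsInfo, actInfo);

% Step 1: extract the policy. With default options this writes
% evaluatePolicy.m and agentData.mat to the current folder.
generatePolicyFunction(agent);

% Quick check at the command line before moving to Simulink.
obs    = zeros(4, 1);                  % dummy observation
action = evaluatePolicy(obs);

% Step 2: body of the MATLAB Function block (shown here as a local
% function; in Simulink, paste it into the block editor instead).
function action = policyController(observation)
%#codegen
% evaluatePolicy.m and agentData.mat must be on the MATLAB path.
action = evaluatePolicy(observation);
end
```

Keeping the block body down to a single call to evaluatePolicy means the Simulink model does not need to change when the policy is retrained; only the two generated files are regenerated.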

To see the details on what layers currently support ANSI C (generic C), please refer to this doc page.

As for the second question, TD3 support for LSTMs came in R2021a in Reinforcement Learning Toolbox. So, as you point out, you would not be able to use this agent in R2020b. However, you should be able to extract the trained policy as described above and run inference on it in R2020b. In R2021a we also added a Simulink block to Deep Learning Toolbox for simulating LSTMs, but I think you should be able to simulate in R2020b using a MATLAB Function block. If you can't, or if the results don't make sense, please reach out to tech support.

Shah Fahad on 14 Nov 2023

Hello Arkadiy,

Is it possible to export the experiences of the DRL agent to a file, such as an Excel file, so that they can be used on another hardware device such as RTDS?

Thank you.

Emmanouil Tzorakoleftherakis on 15 Nov 2023

Hi,

What you are asking should be possible, but I would need some additional info to make sure what the best way to proceed would be. For example:

1) It seems RTDS is used for HIL testing - is this what you are trying to accomplish?

2) What do you mean by "export the experiences of the DRL agent"? By experiences we usually mean actions, observations, and rewards. How are you going to use all of this? What would the setup be? I am asking because, for a HIL test for example, you likely don't need all of that.

Also, since this is an old thread, could you create a new question with some additional details? It helps with discoverability.

Thanks!
