How can I let the reinforcement learning agent know exactly which action to take?

52 views (last 30 days)
Aaron Bramhasta on 5 Nov 2024 at 17:06
Answered: Maneet Kaur Bagga on 21 Nov 2024 at 6:52
Dear MATLAB Experts,
I am currently running a reinforcement learning simulation integrated with a discrete-event system in Simulink. My main discrete-event simulation uses a bus element containing multiple entities: some serve as observations for the RL agent (via an entity-to-signal conversion), and one is used to impose the action the RL agent chooses (via a signal-to-entity conversion). I implemented a policy in the DES where, given certain requirements, an entity attribute is assigned a value that switches an entity gate to determine which course of action to take. However, my reinforcement learning agent does not seem to understand this rule: it assigns the entity value randomly from the available values. Is there a way to make this rule, which is present in the DES, understandable to the RL agent as well?
Thank you so much in advance! I am attaching my model for reference.
Best regards,
Aaron.

Answers (1)

Maneet Kaur Bagga on 21 Nov 2024 at 6:52
Hi,
As per my understanding, the issue arises because your DES contains specific policies, such as switching gates based on entity attributes. These rules are likely hard-coded and not part of the RL environment's observation or reward structure, so the RL agent explores actions based only on the observations it receives and the policy it has learned.
Please refer to the following workarounds:
Incorporate the Rule into the Observations: Add flags or variables to the observation vector that indicate the rule's state (e.g. "gate should switch" = 1/0). Ensure these conditions are updated dynamically during simulation; see the sketch below.
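As a minimal sketch, a MATLAB Function block along these lines could build the extended observation (queueLength, serverBusy, and threshold are hypothetical stand-ins for your model's actual signals):
function obs = buildObservation(queueLength, serverBusy, threshold)
% Encode the DES gate-switching rule as an explicit observation flag
ruleFlag = double(queueLength > threshold); % 1 when the rule says "switch the gate"
obs = [queueLength; serverBusy; ruleFlag];  % vector fed to the RL Agent block
end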
Augment the Reward Structure: Add a reward for actions that align with the DES rules, or a penalty for actions that violate them. This encourages the RL agent to learn rule-conforming behavior:
% Bonus when the agent's action matches the one the DES rule expects
reward = reward + (agentAction == expectedAction) * rewardFactor;
Pretrain the Agent: Use supervised learning to pretrain the RL agent so that the DES rules form its baseline policy, then fine-tune with reinforcement learning; a sketch follows below.
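As a rough sketch, you could generate observation/action pairs from the DES rule and pretrain a classification network with supervised learning (desRuleAction is a hypothetical function returning the rule's discrete action index for an observation; the layer sizes and action count are assumptions):
numObs = 3; numSamples = 5000;
obsData = rand(numSamples, numObs); % sampled observations, one per row
labels = categorical(arrayfun(@(k) desRuleAction(obsData(k,:)), 1:numSamples))';
layers = [
    featureInputLayer(numObs)
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(2) % assuming two discrete actions
    softmaxLayer
    classificationLayer];
opts = trainingOptions("adam", MaxEpochs=20, Verbose=false);
pretrainedNet = trainNetwork(obsData, labels, layers, opts);
The pretrained layers can then be used to initialize the agent's actor before fine-tuning with "train".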
Custom Environment Dynamics: Modify the environment (the DES model) so that the DES rules are enforced during interaction; for instance, override the agent's selected action whenever it violates a rule:
% violatesRule and enforceRule are placeholder helpers encoding your DES policy
if violatesRule(action, currentState)
    action = enforceRule(currentState); % override the agent's choice
end
Regularization: Include constraints in the training process that mimic the DES rules, for example by penalizing rule violations in the loss so that the policy network learns to output rule-conforming actions:
% Conceptual: add a rule-violation penalty inside a custom training loss
loss = loss + ruleViolationPenalty * countViolations(actions, state);
Rule-Based Hybrid Approach: Use the "getAction" function to query the agent's action in specific scenarios and compare it against the DES policy to identify mismatches, for example:
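A small sketch of such a check (currentObservation and desRuleAction are hypothetical placeholders):
obs = {currentObservation};          % observations are passed as a cell array
agentAction = getAction(agent, obs); % the action is returned as a cell array
if ~isequal(agentAction{1}, desRuleAction(currentObservation))
    disp("Agent action deviates from the DES rule")
end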
Please refer to the MathWorks documentation for "getAction" for a better understanding.
Hope this helps!
