
getExplorationPolicy

Extract exploratory (stochastic) policy object from agent

Since R2023a

    Description

    policy = getExplorationPolicy(agent) returns a stochastic policy object from the specified reinforcement learning agent. Stochastic policies are useful for exploration.
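    As a minimal sketch of the basic workflow (the observation and action specifications below are illustrative assumptions, not taken from this page), you can create a default agent and extract its exploration policy:

    % Illustrative sketch: create a default DQN agent from assumed observation
    % and action specifications, then extract its exploration policy.
    obsInfo = rlNumericSpec([4 1]);         % assumed continuous observation space
    actInfo = rlFiniteSetSpec([-10 10]);    % assumed discrete action set
    agent = rlDQNAgent(obsInfo,actInfo);    % default DQN agent
    policy = getExplorationPolicy(agent)    % for a DQN agent, an rlEpsilonGreedyPolicy object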


    Examples


    For this example, load the PG agent trained in Train PG Agent to Balance Discrete Cart-Pole System.

    load("MATLABCartpolePG.mat","agent")

    Extract the agent's greedy policy using getGreedyPolicy.

    policyDtr = getGreedyPolicy(agent)
    policyDtr = 
      rlStochasticActorPolicy with properties:
    
                         Actor: [1x1 rl.function.rlDiscreteCategoricalActor]
        UseMaxLikelihoodAction: 1
                 Normalization: "none"
               ObservationInfo: [1x1 rl.util.rlNumericSpec]
                    ActionInfo: [1x1 rl.util.rlFiniteSetSpec]
                    SampleTime: 1
    
    

    Note that, in the extracted policy object, the UseMaxLikelihoodAction property is set to true. This means that the policy object always generates the maximum likelihood action in response to a given observation, and is therefore greedy (and deterministic).
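    As a quick check (a hedged sketch; the observation value below is an arbitrary stand-in consistent with the four-element cart-pole observation), repeated calls to getAction with the same observation return the same action:

    % Sketch: a greedy policy is deterministic, so the same observation
    % always yields the same (maximum likelihood) action.
    obs = {rand(4,1)};              % arbitrary observation for illustration
    a1 = getAction(policyDtr,obs);
    a2 = getAction(policyDtr,obs);
    isequal(a1,a2)                  % expected: 1 (true)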

    Alternatively, you can extract a stochastic policy using getExplorationPolicy.

    policyXpl = getExplorationPolicy(agent)
    policyXpl = 
      rlStochasticActorPolicy with properties:
    
                         Actor: [1x1 rl.function.rlDiscreteCategoricalActor]
        UseMaxLikelihoodAction: 0
                 Normalization: "none"
               ObservationInfo: [1x1 rl.util.rlNumericSpec]
                    ActionInfo: [1x1 rl.util.rlFiniteSetSpec]
                    SampleTime: 1
    
    

    This time, the extracted policy object has the UseMaxLikelihoodAction property set to false. This means that the policy object generates a random action in response to a given observation. The policy is therefore stochastic and useful for exploration.
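    To see the stochasticity (a hedged sketch; the observation value is again an arbitrary stand-in), sample the policy several times for the same observation:

    % Sketch: a stochastic policy samples actions from the actor's probability
    % distribution, so repeated calls can return different actions.
    obs = {rand(4,1)};              % arbitrary observation for illustration
    acts = zeros(1,10);
    for k = 1:10
        a = getAction(policyXpl,obs);   % sample one action
        acts(k) = a{1};
    end
    acts                            % typically mixes the possible force values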

    Input Arguments


    Reinforcement learning agent from which to extract the exploration policy, specified as one of the following objects:

    • rlQAgent, rlSARSAAgent, or rlDQNAgent object

    • rlDDPGAgent or rlTD3Agent object

    • rlACAgent, rlPGAgent, rlPPOAgent, rlTRPOAgent, or rlSACAgent object

    Note

    If agent is an rlMBPOAgent object, to extract the exploration policy, use getExplorationPolicy(agent.BaseAgent).
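    For example (a hedged sketch; mbpoAgent stands for a hypothetical rlMBPOAgent object):

    % Sketch: for a model-based agent, extract the policy from the base agent.
    policyXpl = getExplorationPolicy(mbpoAgent.BaseAgent);   % mbpoAgent is hypothetical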

    Output Arguments


    Policy object, returned as one of the following:

    • rlEpsilonGreedyPolicy object — Returned when agent is an rlQAgent, rlSARSAAgent, or rlDQNAgent object.

    • rlAdditiveNoisePolicy object — Returned when agent is an rlDDPGAgent or rlTD3Agent object.

    • rlStochasticActorPolicy object, with UseMaxLikelihoodAction set to false — Returned when agent is an rlACAgent, rlPGAgent, rlPPOAgent, rlTRPOAgent, or rlSACAgent object. Since the returned policy object has the UseMaxLikelihoodAction property set to false, it always generates a random action (sampled from the policy probability distribution) in response to a given observation, and is therefore exploratory (and stochastic).
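    As a brief sketch of this mapping (the specifications below are illustrative assumptions), a default DDPG agent yields an additive-noise policy:

    % Sketch: the class of the returned policy depends on the agent type.
    obsInfo = rlNumericSpec([3 1]);             % assumed observation specification
    actInfo = rlNumericSpec([1 1]);             % assumed continuous action specification
    ddpgAgent = rlDDPGAgent(obsInfo,actInfo);   % default DDPG agent
    policyXpl = getExplorationPolicy(ddpgAgent) % an rlAdditiveNoisePolicy object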

    Version History

    Introduced in R2023a