allExperiences
Description
experiences = allExperiences(buffer) returns all experiences stored in the experience buffer buffer as
individual experiences, each with a batch size of 1 and a sequence length of 1.
experiences = allExperiences(buffer,Name=Value) specifies the type and concatenation of the fields in the returned experiences using
one or more name-value arguments. You can specify whether to return the experiences as
dlarray objects and whether to store them on the GPU. You can also return
experiences concatenated along the batch dimension or the sequence dimension.
Examples
Define observation specifications for the environment. For this example, assume that the environment has two observation channels: one channel with two continuous observations and one channel with a three-valued discrete observation.
obsContinuous = rlNumericSpec([2 1],...
    LowerLimit=0,...
    UpperLimit=[1;5]);
obsDiscrete = rlFiniteSetSpec([1 2 3]);
obsInfo = [obsContinuous obsDiscrete];
Define action specifications for the environment. For this example, assume that the environment has a single action channel with two continuous actions, each in a specified range.
actInfo = rlNumericSpec([2 1],...
    LowerLimit=0,...
    UpperLimit=[5;10]);
Create an experience buffer with a maximum length of 5000.
buffer = rlReplayMemory(obsInfo,actInfo,5000);
Append a sequence of 10 random experiences to the buffer.
for i = 1:10
    experience(i).Observation = ...
        {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Action = {actInfo.UpperLimit.*rand(2,1)};
    experience(i).NextObservation = ...
        {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Reward = 10*rand(1);
    experience(i).IsDone = 0;
end
append(buffer,experience);
After appending experiences to the buffer, you can extract all of them. Extract the experiences as individual experiences, each with a batch size of 1 and a sequence length of 1.
experience = allExperiences(buffer)
experience=10×1 struct array with fields:
Observation
Action
NextObservation
Reward
IsDone
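As a minimal sketch of working with the structure array returned above, the following snippet rebuilds the same buffer so that it runs on its own, then reads one experience and gathers all rewards into a vector. It assumes Reinforcement Learning Toolbox is installed; the variable names expNone, firstObs, and rewards are illustrative and do not come from the toolbox.

```matlab
% Rebuild the buffer from this example so the snippet is self-contained.
obsContinuous = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[1;5]);
obsDiscrete = rlFiniteSetSpec([1 2 3]);
obsInfo = [obsContinuous obsDiscrete];
actInfo = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[5;10]);
buffer = rlReplayMemory(obsInfo,actInfo,5000);

for i = 1:10
    experience(i).Observation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Action = {actInfo.UpperLimit.*rand(2,1)};
    experience(i).NextObservation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Reward = 10*rand(1);
    experience(i).IsDone = 0;
end
append(buffer,experience);

% Default mode: one structure per stored experience.
expNone = allExperiences(buffer);

% Observation is a cell array with one element per observation channel.
firstObs = expNone(1).Observation{1};

% Each element holds a single reward (batch size 1, sequence length 1),
% so bracket concatenation collects them into a row vector.
rewards = [expNone.Reward];
disp(numel(expNone))
disp(mean(rewards))
```

Indexing the structure array is convenient for inspecting single transitions; for bulk numeric work, the concatenated modes avoid the per-element indexing.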
Alternatively, you can extract all of the experiences as a single experience batch.
expBatch = allExperiences(buffer,ConcatenateMode="batch")
expBatch = struct with fields:
Observation: {[2×1×10 double] [1×1×10 double]}
Action: {[2×1×10 double]}
Reward: [9.5751 9.1574 7.4313 8.2346 1.8687 1.6261 5.0596 2.5428 3.5166 5.6782]
NextObservation: {[2×1×10 double] [1×1×10 double]}
IsDone: [0 0 0 0 0 0 0 0 0 0]
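The third concatenation mode, "sequence", is not shown in the example above. The following sketch, which assumes Reinforcement Learning Toolbox is installed and uses the illustrative variable name expSeq, extracts the same kind of buffer as a single sequence and prints the field sizes so you can inspect the layout yourself.

```matlab
% Rebuild a small buffer so the snippet is self-contained.
obsContinuous = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[1;5]);
obsDiscrete = rlFiniteSetSpec([1 2 3]);
obsInfo = [obsContinuous obsDiscrete];
actInfo = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[5;10]);
buffer = rlReplayMemory(obsInfo,actInfo,5000);

for i = 1:10
    experience(i).Observation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Action = {actInfo.UpperLimit.*rand(2,1)};
    experience(i).NextObservation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Reward = 10*rand(1);
    experience(i).IsDone = 0;
end
append(buffer,experience);

% Return all experiences as one sequence with a batch size of 1.
expSeq = allExperiences(buffer,ConcatenateMode="sequence");

% Print the array sizes instead of assuming them.
disp(size(expSeq.Observation{1}))
disp(size(expSeq.Action{1}))
disp(size(expSeq.Reward))
```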
Input Arguments
Experience buffer (buffer), specified as a replay memory object, such as the rlReplayMemory object created in the example above.
Name-Value Arguments
Specify optional pairs of arguments as
Name1=Value1,...,NameN=ValueN, where Name is
the argument name and Value is the corresponding value.
Name-value arguments must appear after other arguments, but the order of the
pairs does not matter.
Example: ConcatenateMode="batch"
Concatenation mode (ConcatenateMode), specified as one of the following values.
- "none": Return experience as N individual experiences, each with a batch size of 1 and a sequence length of 1.
- "batch": Return experience as a single batch with a sequence length of 1.
- "sequence": Return experience as a single sequence with a batch size of 1.
Option to return output as a deep learning array, specified as a logical value. When
you specify ReturnDlarray as true, the fields
of experience are dlarray objects.
Example: ReturnDlarray=true
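As a hedged illustration of this option, the sketch below requests dlarray output and converts one observation back to a plain numeric array. It assumes Reinforcement Learning Toolbox and Deep Learning Toolbox are installed; expDl, obs, and obsNumeric are illustrative names.

```matlab
% Rebuild a small buffer so the snippet is self-contained.
obsContinuous = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[1;5]);
obsDiscrete = rlFiniteSetSpec([1 2 3]);
obsInfo = [obsContinuous obsDiscrete];
actInfo = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[5;10]);
buffer = rlReplayMemory(obsInfo,actInfo,5000);

for i = 1:10
    experience(i).Observation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Action = {actInfo.UpperLimit.*rand(2,1)};
    experience(i).NextObservation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Reward = 10*rand(1);
    experience(i).IsDone = 0;
end
append(buffer,experience);

% Request the batch with dlarray output.
expDl = allExperiences(buffer,ConcatenateMode="batch",ReturnDlarray=true);

% Check the class of the first observation channel.
obs = expDl.Observation{1};
disp(class(obs))

% Convert back to the underlying numeric array only if obs is a dlarray.
if isa(obs,"dlarray")
    obsNumeric = extractdata(obs);
else
    obsNumeric = obs;
end
disp(size(obsNumeric))
```

Returning dlarray objects can save a conversion step when the batch feeds directly into code that uses automatic differentiation.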
Option to return output as a GPU array, specified as a logical value. When you
specify ReturnGpuArray as true, the fields of
experience are stored on the GPU.
Setting this option to true requires both Parallel Computing Toolbox™ software and a CUDA®-enabled NVIDIA® GPU. For more information on supported GPUs, see GPU Computing Requirements (Parallel Computing Toolbox).
You can use gpuDevice (Parallel Computing Toolbox) to query or select a local GPU device to be
used with MATLAB®.
Example: ReturnGpuArray=true
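Because this option depends on the available hardware, one possible pattern is to request GPU output only when a supported device is present. The sketch below assumes Reinforcement Learning Toolbox is installed and uses the MATLAB functions canUseGPU, isgpuarray, and gather; the variable names expOut, obs, and obsHost are illustrative.

```matlab
% Rebuild a small buffer so the snippet is self-contained.
obsContinuous = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[1;5]);
obsDiscrete = rlFiniteSetSpec([1 2 3]);
obsInfo = [obsContinuous obsDiscrete];
actInfo = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[5;10]);
buffer = rlReplayMemory(obsInfo,actInfo,5000);

for i = 1:10
    experience(i).Observation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Action = {actInfo.UpperLimit.*rand(2,1)};
    experience(i).NextObservation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Reward = 10*rand(1);
    experience(i).IsDone = 0;
end
append(buffer,experience);

% Request GPU output only if a supported GPU and license are available.
useGpu = canUseGPU;
expOut = allExperiences(buffer,ConcatenateMode="batch",ReturnGpuArray=useGpu);

obs = expOut.Observation{1};
disp(isgpuarray(obs))

% Copy the data back to host memory when it lives on the GPU.
if isgpuarray(obs)
    obsHost = gather(obs);
else
    obsHost = obs;
end
```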
Output Arguments
All N buffered experiences, returned as a structure array or
structure. When ConcatenateMode is:
- "none", experience is returned as a structure array of length N, where each element contains one buffered experience (batchSize = 1 and SequenceLength = 1).
- "batch", experience is returned as a structure. Each field of experience contains all buffered experiences concatenated along the batch dimension (batchSize = N and SequenceLength = 1).
- "sequence", experience is returned as a structure. Each field of experience contains all buffered experiences concatenated along the sequence dimension (batchSize = 1 and SequenceLength = N).
experience contains the following fields.
Observation, returned as a cell array with length equal to the number of
observation specifications specified when creating the buffer. Each element of
Observation contains a
DO-by-batchSize-by-SequenceLength
array, where DO is the dimension of the
corresponding observation specification.
Agent action, returned as a cell array with length equal to the number of
action specifications specified when creating the buffer. Each element of
Action contains a
DA-by-batchSize-by-SequenceLength
array, where DA is the dimension of the
corresponding action specification.
Reward value obtained by taking the specified action from the observation,
returned as a 1-by-batchSize-by-SequenceLength array. In the batch example on this page, Reward is displayed as a 1-by-10 row vector.
Next observation reached by taking the specified action from the observation,
returned as a cell array with the same format as
Observation.
Termination signal, returned as a
1-by-batchSize-by-SequenceLength array of integers. Each element of
IsDone has one of the following values.
- 0: This experience is not the end of an episode.
- 1: The episode terminated because the environment generated a termination signal.
- 2: The episode terminated by reaching the maximum episode length.
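To show how these values might be used, the following sketch counts how many buffered experiences fall into each IsDone category. It assumes Reinforcement Learning Toolbox is installed. For illustration only, it marks the last appended experience with IsDone = 1; the variable names isDone, numOngoing, numTerminated, and numTruncated are hypothetical.

```matlab
% Rebuild a small buffer so the snippet is self-contained.
obsContinuous = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[1;5]);
obsDiscrete = rlFiniteSetSpec([1 2 3]);
obsInfo = [obsContinuous obsDiscrete];
actInfo = rlNumericSpec([2 1],LowerLimit=0,UpperLimit=[5;10]);
buffer = rlReplayMemory(obsInfo,actInfo,5000);

for i = 1:10
    experience(i).Observation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Action = {actInfo.UpperLimit.*rand(2,1)};
    experience(i).NextObservation = {obsInfo(1).UpperLimit.*rand(2,1) randi(3)};
    experience(i).Reward = 10*rand(1);
    % Illustrative assumption: only the last experience ends an episode.
    experience(i).IsDone = double(i == 10);
end
append(buffer,experience);

% Get all termination signals in one array.
expBatch = allExperiences(buffer,ConcatenateMode="batch");
isDone = expBatch.IsDone;

% Count experiences in each category.
numOngoing = nnz(isDone == 0);     % not the end of an episode
numTerminated = nnz(isDone == 1);  % environment termination signal
numTruncated = nnz(isDone == 2);   % maximum episode length reached
fprintf("ongoing: %d, terminated: %d, truncated: %d\n", ...
    numOngoing,numTerminated,numTruncated);
```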
Version History
Introduced in R2022b