Answered
Procedure to link state path and action path in a DQL critic reinforcement learning agent?
Hello, Some comments on the points you raise above: 1. There are two ways to create the critic network for DQN as you probabl...

about 5 years ago | 0

| accepted

Answered
Reinforcement learning DDPG Agent semi active control issue
Hello, This is very open-ended so there could be a lot of ways to improve your setup. My guess is that the issue is very releva...

about 5 years ago | 1

| accepted

Answered
Save listener Callback in eps format or any high resolution format
Hello, If you are using R2020b, you can use help rlPlotTrainingResults to recreate the Episode manager plot and save it as y...

about 5 years ago | 0

| accepted

Answered
Input normalization using a reinforcement learning DQN agent
Hello, Normalization through the input layers is not supported for RL training. As a workaround, you can scale the observations...

about 5 years ago | 1

| accepted

Answered
Export Q-Table from rlAgent
Here is an example: load('basicGWQAgent.mat','qAgent'); critic = getCritic(qAgent); tableObj = getModel(critic); table = table...

about 5 years ago | 1

| accepted

Answered
Replace PI Controller with RL Agent for simple Transfer Function
Please see answer here: https://www.mathworks.com/matlabcentral/answers/779177-ddpg-agent-isn-t-learning-reward-0-for-every-epi...

about 5 years ago | 1

| accepted

Answered
DDPG Agent isn't learning (reward 0 for every episode)
The reason you see 0 rewards is that the IsDone flag (which is used to terminate episodes early) is immediately set to tr...

about 5 years ago | 1

| accepted

Answered
Transient value problem of the variable in reward function of reinforcement learning
You can put the agent block under a triggered subsystem and set it to begin training after 0.06 seconds.

about 5 years ago | 0

| accepted

Answered
Agent is suddently doing random actions and training diverge
This is normal behavior - one common misconception is that once the reward starts going up, it will remain up. This is not true ...

about 5 years ago | 1

| accepted

Answered
Reinforcement Learning does not show that training occurs?
Thanks for the info. I think this is a scaling issue with the plot. The Episode Manager has this option where you can uncheck "Q...

about 5 years ago | 0

Answered
Reinforcement Learning Onramp Issue
Please take a look at this answer.

about 5 years ago | 0

Answered
Creating Q-table
Did you take a look at this example? It seems to solve a similar problem. If you want to use the provided API to create a custo...

about 5 years ago | 0

Answered
Read data from csv file into a reward function for Reinforcement Learning
It seems like you were trying to read the file from within the MATLAB Fcn block (this block assumes that anything you write in i...

about 5 years ago | 0

| accepted

Answered
Reinforcement learning : How to define custom environment with multiple image based observations
For grayscale images, take a look at this example. For RGB, maybe the following would work: ObservationInfo = rlNumericSpec([320...

about 5 years ago | 0

Answered
How to avoid repeated actions and to manually end episode for a DQN agent?
From what you are saying, it seems that training has not converged yet. During training, the agent may every now and then behave...

about 5 years ago | 0

Answered
Set gpu option for rlPPOAgent actor
What you have specified is sufficient for the critic. If you do the same for the actor you are all set - there is no additional ...

about 5 years ago | 0

Answered
Reward in training manager higher than should be
I cannot be sure about the error, but it seems that somewhere in your setup you are currently changing the number of parameter...

about 5 years ago | 0

Answered
Visualize Progress in Reinforcement Learning Toolbox
This is not possible out of the box, but you could implement something like this by setting a counter and saving the current ve...

about 5 years ago | 0

| accepted

Answered
Elements problem due to the deep learning toolbox 'Predict'
Hello, I see the problem. Typically, RL policies have a post-processing part, which may vary from agent to agent, and the Predict...

about 5 years ago | 0

| accepted

Answered
Using RL, How to train multi-agents such that each agent will navigate from its initial position to goal position avoiding collisions?
It's possible that the scenario you described can be solved by training a single agent, and then "deploying" that trained agent ...

about 5 years ago | 0

Answered
Question regarding DDPG PMSM FOC control example
All RL agents in Reinforcement Learning Toolbox operate at fixed discrete-time intervals by default. However, you do not need to...

about 5 years ago | 1

| accepted

Answered
Reinforcement Learning Using Action as Time series command
Let me make sure I understand the question: the RL Agent action is to close the gripper or not, but you want this action to be p...

about 5 years ago | 0

Answered
C code generation for reinforcement learning agent in Simliunk 2019b
I believe C code generation is not supported in R2019b (only C++). Even if C++ works for you, that would require building one of th...

about 5 years ago | 0

| accepted

Answered
How to extract neural network of reinforcement learning agent?
For critics: criticNet = getModel(getCritic(trained_agent)); for actors: actorNet = getModel(getActor(trained_agent)). Note tha...

about 5 years ago | 0

| accepted

Answered
How to set DQN network to approach Q0 ?
There is no single answer here that will get the training to work. My first instinct would be to go for a simpler architecture wi...

about 5 years ago | 1

| accepted

Answered
Is it possible 'LSTM net' to be the env of reinforcement learning?
Hello, Looks like your environment is in MATLAB (i.e. not in Simulink). There is no restriction on using an LSTM as an environ...

about 5 years ago | 0

| accepted

Answered
How can we reduce the overshoot in a controller trained with reinforcement learning while it is tracking a square wave.
Hello, It's all about the reward signal in RL. It's not like with PIDs where you can play with gains and you know from theory w...

about 5 years ago | 0

Submitted
Playing Pong® with deep reinforcement learning
Train a reinforcement learning agent to play a variation of Pong®

about 5 years ago | 3 Downloads | 0.0 / 5

Answered
Enforce action space constraints within the environment
If the environment is in Simulink, you can set up scopes and observe what's happening during training. If the environment is in M...

about 5 years ago | 0

| accepted

Answered
why I get a different action result every new time with same sample observations after deploying trained RL policies?
Which agent are you using? Some agents are stochastic, meaning that the output is sampled based on probability distributions so ...

about 5 years ago | 0

| accepted
