Changing how DQN agent explores
Hi,
I'm using a DQN agent with epsilon-greedy exploration. The problem is that my agent sees state 1 99% of the time, so it never learns to act in other states. By the time it learns to get to state 2 from state 1, epsilon has already decayed significantly and the agent gets stuck taking a sub-optimal action in state 2. Is there a way to implement some other form of exploration, like using a Boltzmann distribution? Thanks for your time.
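For reference, Boltzmann (softmax) exploration picks action a with probability proportional to exp(Q(s, a) / T), where T is a temperature. The snippet below is only a minimal, toolbox-independent sketch of that sampling rule in Python. The function names and the `temperature` parameter are hypothetical and are not part of any MATLAB or Reinforcement Learning Toolbox API, and wiring such a rule into a DQN agent would still require a custom agent or training loop.

```python
import math
import random


def boltzmann_probabilities(q_values, temperature):
    """Return softmax probabilities p(a) proportional to exp(Q(a) / temperature).

    q_values:    list of Q-value estimates for the current state (assumed input).
    temperature: positive scalar; hypothetical exploration parameter.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_action(q_values, temperature, rng=random):
    """Sample an action index from the Boltzmann distribution over q_values."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0]  # made-up Q-values for one state
    print(boltzmann_probabilities(q, temperature=1.0))
    print(boltzmann_action(q, temperature=1.0))
```

A high temperature makes the choice nearly uniform, while a low temperature makes it nearly greedy. Unlike a single global epsilon, the probabilities depend on the Q-values of the current state, so a rarely visited state whose Q-values are still close together continues to be explored.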
2 Comments
Tanay Gupta
on 13 Jul 2021
Can you give a brief description of the states and the respective transitions?
Answers (0)