dqn_agent.DqnAgent: epsilon_greedy not modifiable during training #827

fede72bari · 2023-03-08T08:40:03Z

It seems it is not possible to use a variable (decaying) epsilon during training with dqn_agent.DqnAgent. I have both direct and indirect evidence of this:

1. I defined a custom decay() function and linked epsilon_greedy to it when instantiating the agent. Printing the epsilon_greedy value from time to time during training shows that it stays at the starting value 1 even though decay() should lower it, as if the epsilon_greedy parameter were not treated as a callable.
2. I therefore forced the epsilon_greedy value to change externally during training with agent._epsilon_greedy = decay(step). Printing agent._epsilon_greedy shows that the attribute really changes, but the behavior is still that of a completely random policy, as if the initial value 1 were kept.
3. As a cross-check, I repeated the setup of point 1 with a decay function that starts at 0.011 and ends at a constant epsilon of 0.01. In this case training behaves the same as with a constant epsilon = 0.01, i.e. as if the initial value 0.011 were kept.
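
For illustration, a decay schedule of the kind described in point 1 might look like the sketch below. It is not the exact code used in the experiments: the names make_linear_decay and StepCounter are hypothetical, the linear shape and the 1000-step horizon are assumptions, and no tf_agents call is made. It only shows the difference between a callable that is re-evaluated on every use and a float that is computed once.

```python
def make_linear_decay(start, end, decay_steps, get_step):
    """Return a zero-argument callable that yields the current epsilon.

    get_step is any callable returning the current training step
    (hypothetical; in practice this could read a train step counter).
    """
    def epsilon():
        # Clamp the step so epsilon stays at `end` once the decay is finished.
        step = min(max(get_step(), 0), decay_steps)
        fraction = step / decay_steps
        return start + fraction * (end - start)
    return epsilon


class StepCounter:
    """Hypothetical stand-in for a training step counter."""

    def __init__(self):
        self.value = 0

    def __call__(self):
        return self.value


counter = StepCounter()
epsilon_fn = make_linear_decay(start=1.0, end=0.01, decay_steps=1000,
                               get_step=counter)

# A float computed once never changes, whatever happens to the counter later.
epsilon_snapshot = epsilon_fn()

counter.value = 500
# epsilon_fn() now reflects step 500; epsilon_snapshot still holds the
# value from step 0.
```

With this pattern the schedule only has an effect if whatever consumes epsilon calls epsilon_fn() each time it selects an action. If the value is read once and stored as a float, later changes to the counter, or to an attribute holding the old float, will not reach the policy.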

My conclusion is that the randomness of the policy cannot be changed during training, either by updating the _epsilon_greedy attribute or by linking epsilon_greedy to a custom decay function. Is that true? If so, how can a variable (decaying) epsilon be defined?

fede72bari changed the title ~~epsilon_greedy not modifiable during training~~ dqn_agent.DqnAgent: epsilon_greedy not modifiable during training Mar 13, 2023
