
NaN Loss on train #13

Open
bglick13 opened this issue Jul 15, 2016 · 4 comments

@bglick13

I have been getting NaN losses when I try to train my agent on a game. I tracked it back to the get_batch function in memory.py: Y (the model's predictions) turns to all NaNs about halfway through the first epoch. I haven't been able to figure it out from there, though.

Any suggestion would be much appreciated. This package is fantastic!

@farizrahman4u
Owner

Please specify the model you are using.

@bglick13
Author

Hi, thank you for the response.

I am using essentially the example from the README, but with my own game. I found that if I lower the learning rate dramatically, to 0.001, the problem goes away. Could something in the way I've designed my game be causing this? I'd rather use a slightly larger learning rate if possible.

Thanks again
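
For reference, this is roughly what that fix looks like against a README-style Keras model. A minimal sketch, assuming a Keras 1.x Sequential network along the lines of the Catch example (the layer sizes, input shape, and action count here are placeholders, not bglick13's actual game):

    from keras.models import Sequential
    from keras.layers import Dense, Flatten
    from keras.optimizers import SGD

    nb_frames, grid_size = 1, 10  # placeholder input shape

    model = Sequential()
    model.add(Flatten(input_shape=(nb_frames, grid_size, grid_size)))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(3))  # one linear output per action (Q-values)
    # The README example uses a much larger SGD learning rate;
    # dropping it to 0.001 is the change that stopped the NaNs here.
    model.compile(SGD(lr=0.001), 'mse')

The linear output layer is part of why this matters: with an 'mse' loss on unbounded Q-value targets, a large learning rate can easily overshoot and diverge.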

@farizrahman4u
Copy link
Owner

farizrahman4u commented Jul 16, 2016

Some sort of normalization of your reward might help. Paste your code here and I will take a look.
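
The simplest form of reward normalization is clipping the reward to a fixed range so the Q-targets stay bounded. A minimal sketch, assuming you can intercept the raw reward wherever your game computes it (clip_reward is a hypothetical helper, not part of qlearning4k):

    def clip_reward(raw_reward):
        # Clip to [-1, 1], as the DQN paper does, so r + gamma * Qsa
        # cannot grow without bound from one large score jump.
        return max(-1.0, min(1.0, float(raw_reward)))

Dividing the reward by a known maximum score works just as well, as long as the scale stays small and consistent across the whole game.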

@IntQuant

IntQuant commented Mar 9, 2017

I found this in the logs just before the loss became NaN:

Epoch 143/1000 | Loss 2214.4102 | Epsilon 0.00 | Win count 65
Epoch 144/1000 | Loss 6051275231349243379712.0000 | Epsilon 0.00 | Win count 66
Epoch 145/1000 | Loss 7.3589 | Epsilon 0.00 | Win count 67
Epoch 146/1000 | Loss 11.0253 | Epsilon 0.00 | Win count 68
Epoch 147/1000 | Loss 33.1732 | Epsilon 0.00 | Win count 68
Epoch 148/1000 | Loss 32.7043 | Epsilon 0.00 | Win count 68
Epoch 149/1000 | Loss 3.5222 | Epsilon 0.00 | Win count 69
/usr/local/lib/python3.5/dist-packages/qlearning4k/memory.py:56: RuntimeWarning: invalid value encountered in multiply
targets = (1 - delta) * Y[:batch_size] + delta * (r + gamma * (1 - game_over) * Qsa)
Epoch 150/1000 | Loss nan | Epsilon 0.00 | Win count 69

code.py.zip
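
The log shows the failure mode: the loss briefly explodes (epoch 144 reports roughly 6e21) before anything turns NaN, which suggests the network weights, and hence Y, reach inf first; multiplying inf by 0 in the targets expression is exactly the "invalid value encountered in multiply" the RuntimeWarning reports. Besides lowering the learning rate or clipping rewards, clipping gradients in the optimizer is a common guard against this. A minimal sketch using the standard Keras optimizer arguments (the clip threshold of 1.0 is just an illustrative choice, and the one-layer model is a stand-in for the real network):

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.optimizers import RMSprop

    model = Sequential()
    model.add(Dense(3, input_dim=100))  # stand-in for the real network
    # clipvalue caps each gradient element's magnitude, so a single
    # bad batch cannot push the weights (and the predictions Y) to inf.
    model.compile(RMSprop(lr=0.001, clipvalue=1.0), 'mse')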
