Timing scaden train on cpu vs gpu #101

Closed
nagendraKU opened this issue Jul 2, 2021 · 2 comments

Comments

@nagendraKU

I am running scaden train on a cluster node with a Tesla V100 GPU, but (on casual observation) I don't see a difference in training time whether the GPU is enabled or disabled.

I do get the following message when the GPU is disabled, so it looks like scaden can "see" the GPU? I have tensorflow-gpu installed.

INFO Training M256 Model ... train.py:54
2021-07-02 15:14:22.035782: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

Is there a way to check whether the GPU is being used by scaden? And at a practical level, is it simpler to just let scaden train run on a 40-core CPU than to get the GPU part working?

@KevinMenden
Owner

Hi @nagendraKU ,

It looks like there is some issue with the CUDA installation, and TensorFlow somehow can't connect to the GPU. That can happen for various reasons - hard to tell from here!
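One way to narrow this down is to ask TensorFlow directly which devices it can see, independently of scaden. The following is only a minimal sketch, assuming a TensorFlow 2.x installation in the same environment that scaden runs in; the helper name tensorflow_gpu_report is made up for illustration and is not part of scaden.

```python
def tensorflow_gpu_report():
    """Summarise whether TensorFlow is importable and which GPUs it can see.

    Hypothetical helper for illustration; it is not part of scaden.
    """
    report = {"tensorflow_installed": False, "built_with_cuda": False, "gpus": []}
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not available in this environment.
        return report

    report["tensorflow_installed"] = True
    # True only if this TensorFlow build was compiled with CUDA support.
    report["built_with_cuda"] = bool(tf.test.is_built_with_cuda())
    # An empty list means TensorFlow will run everything on the CPU.
    report["gpus"] = [device.name for device in tf.config.list_physical_devices("GPU")]
    return report


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

If the "gpus" list comes back empty on the V100 node, training is running on the CPU regardless of what is installed, which would be consistent with seeing no time difference. Watching nvidia-smi on the node while training runs is another way to see whether the GPU is actually busy.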

But from a practical standpoint, yes, you're right :) I think you'll be just fine with your 40-core CPU; training should not take too long anyway. So in that case it might not be worth the effort to get the GPU running. It's not a huge model!
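To put numbers on the CPU vs GPU comparison rather than relying on casual observation, the same command can be timed twice, once with the GPU hidden. This is a rough sketch under stated assumptions: time_command is a hypothetical helper, the command list is whatever training invocation you normally use, and hiding the GPU relies on the standard CUDA_VISIBLE_DEVICES environment variable.

```python
import os
import subprocess
import sys
import time


def time_command(cmd, hide_gpu=False):
    """Run cmd (a list of arguments) and return the wall-clock seconds it took.

    Hypothetical helper for illustration. With hide_gpu=True the child
    process gets CUDA_VISIBLE_DEVICES=-1, so CUDA reports no devices and
    TensorFlow falls back to the CPU.
    """
    env = os.environ.copy()
    if hide_gpu:
        env["CUDA_VISIBLE_DEVICES"] = "-1"
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command exits non-zero.
    subprocess.run(cmd, env=env, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Stand-in command; replace with your own training invocation.
    cmd = [sys.executable, "-c", "pass"]
    print("GPU visible:", time_command(cmd))
    print("GPU hidden: ", time_command(cmd, hide_gpu=True))
```

If both runs take about the same time, either the GPU is not being used in the first run or the model is small enough that the GPU brings little benefit.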

Cheers,
Kevin

@nagendraKU
Author

Thanks for the input, Kevin!
