
How much VRAM is needed? #17

Open
filipcoja opened this issue Feb 15, 2021 · 1 comment

Comments

filipcoja commented Feb 15, 2021

I created a Colab notebook to run this project on Google's GPU (I think it comes with a Tesla T4, ~16GB VRAM), and I still get OOM errors. Any ideas, or is it just not possible to run this on a GPU?

https://colab.research.google.com/drive/14zi-3LVOhp6_DQZPl0tW-7H4U3RDUNIK#scrollTo=Sr6EibfN97Xo


Twizwei commented Apr 5, 2021

Same here. I tested `test_reflection.py` on an NVIDIA V100 16GB GPU, changing `nn_opts['gpu_devices'] = ['/device:CPU:0']` and `nn_opts['controller'] = '/device:CPU:0'` to `nn_opts['gpu_devices'] = ['/device:GPU:0']` and `nn_opts['controller'] = '/device:GPU:0'`, and also overriding `os.environ["CUDA_VISIBLE_DEVICES"]` to `'0'`. This resulted in an OOM error.
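For reference, this is a minimal sketch of the device changes described above. The `nn_opts` keys are taken from the comment; the surrounding options dict is a placeholder for whatever `test_reflection.py` actually builds, so adapt it to the real script:

```python
import os

# Expose only the first GPU to TensorFlow before the session is created.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Placeholder for the options dict constructed in test_reflection.py.
nn_opts = {}

# Original CPU settings from the script:
nn_opts['gpu_devices'] = ['/device:CPU:0']
nn_opts['controller'] = '/device:CPU:0'

# Changed to target the first GPU instead:
nn_opts['gpu_devices'] = ['/device:GPU:0']
nn_opts['controller'] = '/device:GPU:0'
```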

Below is my error message.

Limit:                 68719476736
InUse:                  9660434944
MaxInUse:               9660434944
NumAllocs:                    2676
MaxAllocSize:            668467200

2021-04-05 16:05:34.480915: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ******************_************************************__**************************__*******___*_***
2021-04-05 16:05:34.480937: W tensorflow/core/framework/op_kernel.cc:1275] OP_REQUIRES failed at gpu_swapping_kernels.cc:43 : Resource exhausted: OOM when allocating tensor with shape[40,32,272,480] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator cuda_host_bfc
2021-04-05 16:05:34.558797: E tensorflow/stream_executor/cuda/cuda_driver.cc:965] failed to alloc 17179869184 bytes on host: CUDA_ERROR_INVALID_VALUE
2021-04-05 16:05:34.558835: W ./tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:40] could not allocate pinned host memory of size: 17179869184
2021-04-05 16:05:34.636627: E tensorflow/stream_executor/cuda/cuda_driver.cc:965] failed to alloc 17179869184 bytes on host: CUDA_ERROR_INVALID_VALUE
2021-04-05 16:05:34.636665: W ./tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:40] could not allocate pinned host memory of size: 17179869184

I guess it's because the test image is fairly large (1920 x 1080). I know we can still get results by running the code on CPUs, but it's really slow. Is there a way to accelerate the testing? Any help would be appreciated!
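To see why the resolution matters: the failing allocation in the log, shape `[40, 32, 272, 480]` in float32, works out to exactly the `MaxAllocSize` of 668467200 bytes reported above, i.e. roughly 0.62 GiB for a single intermediate tensor. Since activation memory scales with the spatial dimensions, downscaling the input (e.g. halving each side of the 1920x1080 frame) is one plausible way to fit on a 16GB card, assuming the model accepts arbitrary input sizes:

```python
# Memory required by the tensor from the OOM message:
# shape [40, 32, 272, 480], 4 bytes per float32 element.
shape = (40, 32, 272, 480)
num_elements = 1
for dim in shape:
    num_elements *= dim
bytes_needed = num_elements * 4

print(bytes_needed)           # matches the MaxAllocSize in the log: 668467200
print(bytes_needed / 2**30)   # ~0.62 GiB for this one tensor alone

# Halving both spatial dimensions quarters every such activation tensor:
half_res_bytes = (40 * 32 * (272 // 2) * (480 // 2)) * 4
print(half_res_bytes / bytes_needed)  # 0.25
```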
