CUDA out of memory issue #15
That is very odd; what resolution do the images have? Can you maybe share the log of what parameters you used?
Sorry for the late reply. My config was:

    # original camera parameters
    cam:
      # We calibrate the camera once in prgbd mode without any scale optimization, which roughly gets the right parameters
      fx: 275 # heuristic: 1296.0

And my running script:

    python run.py data=Custom/hd.yaml
Hey, your resolution is not too big, you don't have a lot of images, and you don't seem to have a lot of Gaussians either, so I don't really understand why this OOM would happen. Can you give more info on where/when this is triggered? Could you try to run the SLAM system without the backend by setting …
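The exact option is not shown in the comment above; as a minimal sketch, assuming the project's Hydra-style command-line overrides and a hypothetical run_backend flag (check the repository's config files for the real name), disabling the backend might look like:

    # Hypothetical sketch: disable the backend optimizer via a command-line override.
    # "run_backend" is an assumed flag name, not confirmed by the thread above.
    python run.py data=Custom/hd.yaml run_backend=False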
Hi, I've hit the CUDA OOM issue even with a small dataset of 126 images. I use the MCMC Gaussian splatting strategy and set cap_max=150,000 to reduce the memory footprint, but the process still crashed with an OOM error on my A100 GPU:
  File "/aistudio/workspace/system-default/envs/droidsplat/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/aistudio/workspace/system-default/envs/droidsplat/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/aistudio/workspace/aigc/wangqihang013/aigc3d/repos/neural_rendering/sfm/DROID-Splat/src/slam.py", line 310, in tracking
    self.frontend(timestamp, image, depth, intrinsic, gt_pose, static_mask=static_mask)
  File "/aistudio/workspace/system-default/envs/droidsplat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/aistudio/workspace/system-default/envs/droidsplat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/aistudio/workspace/system-default/envs/droidsplat/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/aistudio/workspace/aigc/wangqihang013/aigc3d/repos/neural_rendering/sfm/DROID-Splat/src/frontend.py", line 39, in forward
    self.optimizer()  # Local Bundle Adjustment
  File "/aistudio/workspace/aigc/wangqihang013/aigc3d/repos/neural_rendering/sfm/DROID-Splat/src/frontend.py", line 220, in __call__
    self.__update()
  File "/aistudio/workspace/aigc/wangqihang013/aigc3d/repos/neural_rendering/sfm/DROID-Splat/src/frontend.py", line 100, in __update
    self.graph.rm_factors(self.graph.age > self.max_age, store=True)
  File "/aistudio/workspace/system-default/envs/droidsplat/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "/aistudio/workspace/aigc/wangqihang013/aigc3d/repos/neural_rendering/sfm/DROID-Splat/src/factor_graph.py", line 178, in rm_factors
    self.corr = self.corr[~mask]
  File "/aistudio/workspace/aigc/wangqihang013/aigc3d/repos/neural_rendering/sfm/DROID-Splat/src/modules/corr.py", line 72, in __getitem__
    self.corr_pyramid[i] = self.corr_pyramid[i][index]
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 334.00 MiB. GPU 0 has a total capacty of 39.42 GiB of which 289.06 MiB is free. Process 48412 has 31.97 GiB memory in use. Process 65823 has 2.55 GiB memory in use. Process 67515 has 1.78 GiB memory in use. Process 68107 has 416.00 MiB memory in use. Process 69865 has 2.02 GiB memory in use. Process 70462 has 416.00 MiB memory in use. Of the allocated memory 652.43 MiB is allocated by PyTorch, and 647.57 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
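As the error message itself suggests, one low-effort thing to try is PyTorch's allocator configuration to reduce fragmentation before launching the run; a minimal sketch (the 128 MiB split size is an illustrative value, not a project recommendation):

    # Limit the allocator's split size to mitigate fragmentation, as hinted by the OOM message above.
    # 128 MiB is only an example value; tune it for your workload.
    export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
    python run.py data=Custom/hd.yaml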