Trying to run this in Windows 10 WSL2 on a 3080 Ti w/ 12 GB of VRAM. Setting offload_per_layer=7 does not seem to help: VRAM usage never goes above 6.5 GB, so there seems to be plenty of room available.
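For reference, this is roughly how I'm configuring the offloading, following the repo's demo notebook (the OffloadConfig field names below are taken from that notebook and may not match every revision of src/build_model.py):

import torch
from src.build_model import OffloadConfig

num_experts = 8            # Mixtral-8x7B has 8 experts per layer
num_hidden_layers = 32     # and 32 transformer layers
offload_per_layer = 7      # keep only 1 expert per layer resident on the GPU

# Field names follow the demo notebook; they may differ in other revisions.
offload_config = OffloadConfig(
    main_size=num_hidden_layers * (num_experts - offload_per_layer),
    offload_size=num_hidden_layers * offload_per_layer,
    buffer_size=4,
    offload_per_layer=offload_per_layer,
)

With offload_per_layer=7 that works out to 32 × 7 = 224 offloaded expert storages, each of which expert_cache.py tries to pin, and the run then fails with: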
/home/mrnova/.conda/envs/mixtral/lib/python3.10/site-packages/torch/nn/init.py:412: UserWarning: Initializing zero-element tensors is a no-op
  warnings.warn("Initializing zero-element tensors is a no-op")
Traceback (most recent call last):
  File "/home/mrnova/mixtral-offloading/main.py", line 54, in <module>
    model = build_model(
  File "/home/mrnova/mixtral-offloading/src/build_model.py", line 204, in build_model
    expert_cache = ExpertCache(
  File "/home/mrnova/mixtral-offloading/src/expert_cache.py", line 67, in __init__
    self.offloaded_storages = [
  File "/home/mrnova/mixtral-offloading/src/expert_cache.py", line 68, in <listcomp>
    torch.UntypedStorage(self.module_size).pin_memory(self.device) for _ in range(offload_size)]
  File "/home/mrnova/.conda/envs/mixtral/lib/python3.10/site-packages/torch/storage.py", line 226, in pin_memory
    cast(Storage, self)).pin_memory(device)
RuntimeError: CUDA error: out of memory
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
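For what it's worth, the failing line can be reproduced in isolation; module_size below is just a placeholder for the per-expert byte count that ExpertCache computes, not the real value:

import torch

# Placeholder for the per-expert storage size that ExpertCache computes;
# the real value depends on the model and quantization config.
module_size = 256 * 1024 * 1024  # 256 MiB, illustrative only

# Mirrors src/expert_cache.py line 68: allocate an untyped byte buffer and pin it
# so it can be copied to the GPU asynchronously. pin_memory() allocates page-locked
# *host* (CPU) memory via the CUDA driver, which may be why the error reads
# "CUDA error: out of memory" even though VRAM usage stays low.
storage = torch.UntypedStorage(module_size).pin_memory("cuda")
print(storage.nbytes(), "bytes pinned")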