How to offload when using multi GPU (NeMo) #71

clairelee5740 · 2025-01-20T09:19:37Z

Following https://github.com/NVIDIA/Cosmos/tree/main/cosmos1/models/diffusion/nemo/inference#run-the-inference-script-with-base-model, I ran:
NVTE_FUSED_ATTN=0 \
torchrun --nproc_per_node=$NUM_DEVICES cosmos1/models/diffusion/nemo/inference/general.py \
    --model Cosmos-1.0-Diffusion-7B-Text2World \
    --cp_size $NUM_DEVICES \
    --num_devices $NUM_DEVICES \
    --video_save_path "Cosmos-1.0-Diffusion-7B-Text2World.mp4" \
    --guidance 7 \
    --seed 1 \
    --prompt "$PROMPT" \
    --enable_prompt_upsampler

I got a CUDA out-of-memory error, so I tried adding the offload flags:
NVTE_FUSED_ATTN=0 \
torchrun --nproc_per_node=$NUM_DEVICES cosmos1/models/diffusion/nemo/inference/general.py \
    --model Cosmos-1.0-Diffusion-7B-Text2World \
    --cp_size $NUM_DEVICES \
    --num_devices $NUM_DEVICES \
    --video_save_path "Cosmos-1.0-Diffusion-7B-Text2World.mp4" \
    --guidance 7 \
    --seed 1 \
    --prompt "$PROMPT" \
    --enable_prompt_upsampler \
    --offload_tokenizer \
    --offload_diffusion_transformer \
    --offload_text_encoder_model \
    --offload_prompt_upsampler \
    --offload_guardrail_models

but the script rejects them:
general.py: error: unrecognized arguments: --offload_tokenizer --offload_diffusion_transformer --offload_text_encoder_model --offload_prompt_upsampler --offload_guardrail_models

Has anyone run into the same problem with the NeMo framework? Is there a solution?
By the way, I am using four RTX 4090 GPUs (24 GB each), and there is still not enough GPU memory.


ethanhe42 · 2025-02-04T19:27:37Z

Offloading is not supported yet in the NeMo inference script. As a workaround, you can try modifying the code to delete each model after it has been used, e.g. del t5_model.
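A minimal sketch of that workaround, assuming the pipeline can be split into stages that each need only one model at a time. The names `run_stage`, `free_gpu_memory`, and the `load_model` callable are hypothetical and are not part of the Cosmos or NeMo code; only `gc.collect()` and `torch.cuda.empty_cache()` are real library calls. The `try`/`except` around the import is there only so the sketch also runs on a machine without PyTorch.

```python
import gc

try:
    import torch
except ImportError:  # allows the sketch to run without PyTorch installed
    torch = None


def free_gpu_memory():
    """Collect unreachable objects, then release cached CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


def run_stage(load_model, inputs):
    """Load a model, use it once, and drop it before the next stage.

    `load_model` is a hypothetical zero-argument callable that returns
    a callable model (for example, the T5 text encoder).
    """
    model = load_model()
    outputs = model(inputs)
    del model  # drop the last reference so the weights can be freed
    free_gpu_memory()
    return outputs
```

Under torchrun each rank is a separate process, so every rank has to drop its own copy. The memory is only released if nothing else still references the model, and any outputs that must outlive the stage should not hold a reference back to it.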

pjannaty assigned ethanhe42 Jan 24, 2025

sophiahhuang added the "question" label (Further information is requested) Jan 27, 2025
