Very slow to load deep seekv3 int4 model and device_map="auto" "sequential" bug #35522

wenhuach21 · 2025-01-06T02:07:15Z

System Info

transformers 4.47.0

Who can help?

No response

Reproduction

Please refer to the code in the model card: https://huggingface.co/OPEA/DeepSeek-V3-int4-sym-gptq-inc

Expected behavior

1. Loading is very slow. Loading the model (https://huggingface.co/OPEA/DeepSeek-V3-int4-sym-gptq-inc) on a DGX system with 2TB of memory and 7x80GB A100 GPUs takes 30 minutes to 1 hour.

2. device_map bug. On 7x80GB A100 GPUs, device_map='auto' results in an OOM error, and switching to device_map='sequential' still causes an OOM error on card 0, even with max_memory configured.
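For reference, the kind of configuration described in point 2 can be sketched as below. This is a minimal illustration, not the exact code from the model card: `build_max_memory` and `load_model` are hypothetical helpers, and the 70 GiB and 40 GiB limits are assumed example values, not settings reported in this issue. The `device_map` and `max_memory` keyword arguments are the ones `from_pretrained` accepts; `trust_remote_code=True` is assumed to be needed for this model on transformers 4.47.0.

```python
def build_max_memory(num_gpus, per_gpu_gib, gpu0_gib=None, cpu_gib=None):
    """Build a max_memory dict: integer GPU ids (and optionally "cpu") -> size strings."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Same limit for every GPU by default.
    max_memory = {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}
    # Optionally leave extra headroom on GPU 0, where the OOM is reported.
    if gpu0_gib is not None:
        max_memory[0] = f"{gpu0_gib}GiB"
    # Optionally cap how much CPU RAM may be used for offloaded weights.
    if cpu_gib is not None:
        max_memory["cpu"] = f"{cpu_gib}GiB"
    return max_memory


def load_model(model_id, max_memory, device_map="sequential"):
    """Hypothetical wrapper around from_pretrained; requires transformers and accelerate."""
    from transformers import AutoModelForCausalLM  # imported lazily on purpose

    return AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map=device_map,   # "auto" or "sequential", as in the report
        max_memory=max_memory,   # per-device limits built above
        torch_dtype="auto",
        trust_remote_code=True,  # assumption: the model ships custom modeling code
    )


if __name__ == "__main__":
    # Assumed example values for a 7x80GB machine; tune for your own setup.
    limits = build_max_memory(num_gpus=7, per_gpu_gib=70, gpu0_gib=40)
    print(limits)
    # model = load_model("OPEA/DeepSeek-V3-int4-sym-gptq-inc", limits)
```

Lowering the limit for GPU 0 is a common way to leave room for activations and temporary buffers on the first device. Whether it avoids the OOM reported here is untested.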

wenhuach21 added the bug label Jan 6, 2025

wenhuach21 changed the title ~~Very slow to load deep seekv3 int4 model and device_map="auto" and "sequential" bug~~ Very slow to load deep seekv3 int4 model and device_map="auto" "sequential" bug Jan 6, 2025
