Very slow to load deep seekv3 int4 model and device_map="auto" "sequential" bug #35522

Open
wenhuach21 opened this issue Jan 6, 2025 · 0 comments

wenhuach21 commented Jan 6, 2025

System Info

transformers 4.47.0

Who can help?

No response

Reproduction

Please refer to the code in the model card: https://huggingface.co/OPEA/DeepSeek-V3-int4-sym-gptq-inc

Expected behavior

1. Loading is very slow. Loading the model (https://huggingface.co/OPEA/DeepSeek-V3-int4-sym-gptq-inc) on a DGX system with 2 TB of memory and 7x80GB A100 GPUs takes 30 minutes to 1 hour.

2. device_map bug. On 7x80GB A100 GPUs, device_map='auto' results in an OOM error, and switching to device_map='sequential' still causes an OOM error on card 0, even with max_memory configured.
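
For readers who want to see the shape of the call described in item 2, here is a minimal sketch. It is not the reproduction code from the model card. The helper names `build_max_memory` and `load_quantized_model` are hypothetical, and the budget values (40 GiB for GPU 0, 70 GiB for the others) are illustrative assumptions, not settings taken from this report. The sketch only shows how a per-device `max_memory` mapping is passed together with `device_map`; it is not claimed to avoid the OOM.

```python
def build_max_memory(num_gpus, per_gpu_gib, first_gpu_gib=None, cpu_gib=None):
    """Build a max_memory mapping: GPU index (or "cpu") -> size string.

    Hypothetical helper. first_gpu_gib lets GPU 0 get a smaller budget
    than the other cards, leaving headroom there for activations.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    budget = {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}
    if first_gpu_gib is not None:
        budget[0] = f"{first_gpu_gib}GiB"
    if cpu_gib is not None:
        budget["cpu"] = f"{cpu_gib}GiB"
    return budget


def load_quantized_model(model_name, max_memory, device_map="sequential"):
    """Load a checkpoint with an explicit device_map and max_memory.

    Hypothetical wrapper. transformers is imported lazily so that
    build_max_memory can be used without the library installed.
    """
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map=device_map,    # "auto" or "sequential", as in the report
        max_memory=max_memory,    # per-device caps built above
        torch_dtype="auto",
        trust_remote_code=True,   # assumption: the checkpoint ships custom code
    )


# Illustrative budget for 7 GPUs; the numbers are assumptions.
max_memory = build_max_memory(num_gpus=7, per_gpu_gib=70, first_gpu_gib=40)

# Loading needs the hardware described above, so it is left commented out:
# model = load_quantized_model("OPEA/DeepSeek-V3-int4-sym-gptq-inc", max_memory)
```

According to the report, card 0 still runs out of memory with device_map='sequential' even when a mapping like this is supplied.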

@wenhuach21 wenhuach21 added the bug label Jan 6, 2025
@wenhuach21 wenhuach21 changed the title Very slow to load deep seekv3 int4 model and device_map="auto" and "sequential" bug Very slow to load deep seekv3 int4 model and device_map="auto" "sequential" bug Jan 6, 2025