
[Help] torch.compile(model.forward, mode="reduce-overhead", fullgraph=True) fails #1496

Open
1 task done
lilxmx opened this issue Nov 7, 2024 · 0 comments
lilxmx commented Nov 7, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I am following the official Transformers tutorial for speeding up inference: https://huggingface.co/docs/transformers/llm_optims?static-kv=basic+usage%3A+generation_config

model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)
[screenshot of the error output]
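For context, a minimal sketch of the setup the linked tutorial describes; the checkpoint name, prompt, dtype, and device below are illustrative assumptions, not taken from the original report:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; the original report does not name the model.
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# Static KV cache plus compiled forward, as in the tutorial.
model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))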

Expected Behavior

No response

Steps To Reproduce

with torch.no_grad():
    vector_outputs = model(
        **seq, output_hidden_states=True, return_dict=True
    )

The error is raised as soon as model() is called; execution never reaches the next layer before failing.
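A self-contained sketch of the reproduction path, with the checkpoint, input text, and device filled in as assumptions for illustration (the original report does not specify them):

import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder checkpoint and input text, assumed for this sketch.
model_id = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id).to("cuda").eval()

# Compile the forward pass as in the issue title.
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

seq = tokenizer("example input", return_tensors="pt").to("cuda")
with torch.no_grad():
    vector_outputs = model(**seq, output_hidden_states=True, return_dict=True)
print(vector_outputs.last_hidden_state.shape)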

Environment

- OS:
- Python: 3.8.20
- Transformers: 4.30.2
- PyTorch: 2.0.1
- CUDA Support: True

Anything else?

No response
