Llama-Vision: Enable tracing, refactor generation code #15005

Merged: 19 commits, Nov 14, 2024

@cglagovichTT (Contributor) commented Nov 13, 2024

Ticket

#14519

What's changed

This PR changes the Llama-Vision interface to make it easier to add batch>1 inference, continuous batching, and vLLM integration.

  • Refactored Llama-Vision demos
    • Implemented prefill/decode wrapper in vision_generator.py
    • Use new generator wrapper in all demos
    • Added simple_vision_demo.py for easy testing and e2e perf measurement
  • Refactored Llama cross attention tests
    • Added support for batch>1 xattn cache generation
  • Enable tracing in Llama-Vision
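
To illustrate the prefill/decode split mentioned above, here is a minimal, device-free sketch. It is not the code in vision_generator.py: `ToyModel`, `ToyGenerator`, `prefill_forward`, `decode_forward`, and `generate` are hypothetical names chosen for this example, and the "model" is a deterministic stand-in with no vision inputs. The point is only the shape of the interface: prefill runs once per prompt and builds a per-user cache, then decode advances every user in the batch by one token per step.

```python
from typing import List, Tuple


class ToyModel:
    """Hypothetical stand-in for a real model; the 'KV cache' is just the token history."""

    vocab_size = 16

    def _next_token(self, cache: List[int]) -> int:
        # Deterministic fake "logits -> argmax" so the example is reproducible.
        return (sum(cache) + len(cache)) % self.vocab_size

    def prefill(self, prompt: List[int]) -> Tuple[List[int], int]:
        # Process the whole prompt at once and return (cache, first generated token).
        cache = list(prompt)
        return cache, self._next_token(cache)

    def decode(self, token: int, cache: List[int]) -> int:
        # Consume one token, update the cache in place, return the next token.
        cache.append(token)
        return self._next_token(cache)


class ToyGenerator:
    """Hypothetical wrapper exposing separate prefill and decode entry points."""

    def __init__(self, model: ToyModel) -> None:
        self.model = model

    def prefill_forward(self, prompts: List[List[int]]) -> Tuple[List[List[int]], List[int]]:
        # One prefill per user; prompts may have different lengths.
        results = [self.model.prefill(p) for p in prompts]
        caches = [cache for cache, _ in results]
        tokens = [tok for _, tok in results]
        return caches, tokens

    def decode_forward(self, tokens: List[int], caches: List[List[int]]) -> List[int]:
        # One decode step for every user in the batch.
        return [self.model.decode(t, c) for t, c in zip(tokens, caches)]

    def generate(self, prompts: List[List[int]], max_new_tokens: int) -> List[List[int]]:
        if max_new_tokens <= 0:
            return [[] for _ in prompts]
        caches, tokens = self.prefill_forward(prompts)
        outputs = [[t] for t in tokens]
        for _ in range(max_new_tokens - 1):
            tokens = self.decode_forward(tokens, caches)
            for out, tok in zip(outputs, tokens):
                out.append(tok)
        return outputs


if __name__ == "__main__":
    gen = ToyGenerator(ToyModel())
    print(gen.generate([[1, 2, 3], [4]], max_new_tokens=3))
```

Keeping prefill and decode as separate calls means a scheduler (for example, a continuous-batching loop) can admit a new user with a prefill while other users keep decoding, which is the kind of integration the PR description says the refactor is meant to ease.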

…ommit goes back to nlp_tms to create/concat heads.
…porting mask shapes required by non-causal FlashDecode
…tch > 1. WIP, since these changes have now broken the full model and demos
…reation and device tensor transformations. Enabled tracing in simple_vision_demo with an easy trace function

@ayerofieiev-tt (Member) left a comment

Please don't forget to update the PR title.

@cglagovichTT changed the title from "Cglagovich/14519 noopt" to "Llama-Vision: Enable tracing, refactor generation code" on Nov 14, 2024

@mtairum (Contributor) left a comment

An interface matching the one from Meta's Llama is great!

@cglagovichTT merged commit 758f8c9 into main on Nov 14, 2024
149 of 152 checks passed
@cglagovichTT deleted the cglagovich/14519_noopt branch on November 14, 2024 18:04