build: Upgrade to 24.07, TRT-LLM 0.11.0, and Triton CLI v0.0.10 #81

Merged: 7 commits, Aug 6, 2024

rmccorm4
Collaborator

@rmccorm4 rmccorm4 commented Aug 5, 2024

  • Bumps Triton version to 24.07
  • Bumps TRT-LLM version to 0.11.0
    • Updates model config templates
    • Updates the template parser to account for new fields, and moves more sensible defaults into the parser so the templates don't have to be modified by hand each release
  • Bumps Triton CLI version to 0.0.10
  • Adds a minor improvement to triton infer: inputs are now included in the output for easier dev/debugging
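
For reviewers unfamiliar with the template change, here is a rough sketch of what "defaults in the parser" could look like. It is not the code from this PR: the `${field}` placeholder syntax, the field names, the default values, and the `fill_template` helper are all assumptions made for illustration.

```python
from string import Template
from typing import Dict, Optional

# Hypothetical defaults kept in the parser, so a template that gains a new
# field in a later release still fills without hand-editing the template.
PARSER_DEFAULTS: Dict[str, str] = {
    "triton_max_batch_size": "64",
    "decoupled_mode": "false",
}


def fill_template(template_text: str,
                  overrides: Optional[Dict[str, str]] = None) -> str:
    """Fill ${field} placeholders, preferring overrides over parser defaults."""
    values = {**PARSER_DEFAULTS, **(overrides or {})}
    # substitute() raises KeyError when a placeholder has no value at all,
    # which flags fields introduced by a new release that lack a default.
    return Template(template_text).substitute(values)


# Hypothetical two-line model config template.
EXAMPLE_TEMPLATE = (
    "max_batch_size: ${triton_max_batch_size}\n"
    "model_transaction_policy { decoupled: ${decoupled_mode} }\n"
)

filled = fill_template(EXAMPLE_TEMPLATE, {"triton_max_batch_size": "8"})
print(filled)
```

The idea is that a version bump only requires adding an entry to the defaults table when a template introduces a new field; a missing entry fails loudly instead of leaving an unfilled placeholder in the generated config.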

Tested gpt2 locally in the 24.07 TRT-LLM container. Will see if the llama/opt models pass the CI pipeline.

@rmccorm4 rmccorm4 requested a review from nvda-mesharma as a code owner August 5, 2024 19:15
@rmccorm4 rmccorm4 requested a review from KrishnanPrash August 5, 2024 19:17
@rmccorm4 rmccorm4 changed the title from "Upgrade to 24.07, TRT-LLM 0.11.0, and Triton CLI v0.0.10" to "build: Upgrade to 24.07, TRT-LLM 0.11.0, and Triton CLI v0.0.10" Aug 5, 2024
KrishnanPrash
KrishnanPrash previously approved these changes Aug 5, 2024
Contributor

@KrishnanPrash KrishnanPrash left a comment

LGTM!

@rmccorm4
Collaborator Author

rmccorm4 commented Aug 5, 2024

Tested TRT-LLM locally as a sanity check while the pipelines are running:

export IMAGE_KIND=TRTLLM
export TRTLLM_MODEL="llama-3-8b-instruct"
pytest -s -v
=== 51 passed, 4 skipped in 123.80s (0:02:03) ===

@rmccorm4
Collaborator Author

rmccorm4 commented Aug 6, 2024

Pipeline 17230346 passed. If the linked PR triggered by this PR shows a failure, it has so far been caused by some strange node allocation timeout issues.

@rmccorm4 rmccorm4 requested a review from KrishnanPrash August 6, 2024 17:30
@rmccorm4 rmccorm4 merged commit a050ec1 into main Aug 6, 2024
4 of 6 checks passed
@rmccorm4 rmccorm4 deleted the rmccormick-24.07 branch August 6, 2024 19:06