From a9af5f42aff287c52b551dbfe959be28d947d7be Mon Sep 17 00:00:00 2001
From: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
Date: Mon, 12 Feb 2024 17:06:07 -0800
Subject: [PATCH 1/3] Update README and versions for 24.02 branch (#33)

---
 README.md | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/README.md b/README.md
index 6a5a6d4f..1ce83f89 100644
--- a/README.md
+++ b/README.md
@@ -28,12 +28,6 @@
 [![License](https://img.shields.io/badge/License-BSD3-lightgrey.svg)](https://opensource.org/licenses/BSD-3-Clause)
 
-**LATEST RELEASE: You are currently on the main branch which tracks
-under-development progress towards the next release. The current release branch
-is [r24.01](https://github.com/triton-inference-server/vllm_backend/tree/r24.01)
-and which corresponds to the 24.01 container release on
-[NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver).**
-
 # vLLM Backend
 
 The Triton backend for [vLLM](https://github.com/vllm-project/vllm)
@@ -81,7 +75,14 @@ script. A sample command to build a Triton Server container with all
 options enabled is shown below. Feel free to customize flags according to
 your needs.
 
+Please use [NGC registry](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver/tags)
+to get the latestt version of the Triton container, which corresponds to the
+latest YY.MM (year.month) of [Triton release](https://github.com/triton-inference-server/server/releases).
+
+
 ```
+# YY.MM is the version of Triton.
+TRITON_CONTAINER_VERSION=
 ./build.py -v --enable-logging
     --enable-stats
     --enable-tracing
@@ -96,9 +97,9 @@ A sample command to build a Triton Server container with all options enabled is
     --endpoint=grpc
     --endpoint=sagemaker
     --endpoint=vertex-ai
-    --upstream-container-version=24.01
-    --backend=python:r24.01
-    --backend=vllm:r24.01
+    --upstream-container-version=${TRITON_CONTAINER_VERSION}
+    --backend=python:r${TRITON_CONTAINER_VERSION}
+    --backend=vllm:r${TRITON_CONTAINER_VERSION}
 ```
 
 ### Option 3. Add the vLLM Backend to the Default Triton Container

From 7734a1a4d4b39c4b929fb51737d3dcc1b1bba45d Mon Sep 17 00:00:00 2001
From: Misha Chornyi <99709299+mc-nv@users.noreply.github.com>
Date: Thu, 29 Feb 2024 17:46:42 -0800
Subject: [PATCH 2/3] Update README.md

Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1ce83f89..287ab34e 100644
--- a/README.md
+++ b/README.md
@@ -76,7 +76,7 @@ script. A sample command to build a Triton Server container with all
 options enabled is shown below. Feel free to customize flags according to
 your needs.
 
 Please use [NGC registry](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver/tags)
-to get the latestt version of the Triton container, which corresponds to the
+to get the latest version of the Triton vLLM container, which corresponds to the
 latest YY.MM (year.month) of [Triton release](https://github.com/triton-inference-server/server/releases).
 
From 1e49070a3e618884f33bc2f62b5757044ddc5a93 Mon Sep 17 00:00:00 2001
From: Misha Chornyi <99709299+mc-nv@users.noreply.github.com>
Date: Thu, 29 Feb 2024 17:46:53 -0800
Subject: [PATCH 3/3] Update README.md

Co-authored-by: Olga Andreeva <124622579+oandreeva-nv@users.noreply.github.com>
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 287ab34e..13953f58 100644
--- a/README.md
+++ b/README.md
@@ -82,7 +82,7 @@ latest YY.MM (year.month) of [Triton release](https://github.com/triton-inferenc
 ```
 # YY.MM is the version of Triton.
-TRITON_CONTAINER_VERSION=
+export TRITON_CONTAINER_VERSION=
 ./build.py -v --enable-logging
     --enable-stats
     --enable-tracing
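Taken together, the series parameterizes the README's build command on a single environment variable (and switches to `export` so `build.py`'s subprocesses can see it). A minimal sketch of the resulting usage is below; the `24.02` value is an assumption taken from the branch name in patch 1's subject, not a pinned requirement — substitute whichever YY.MM release you are targeting.

```shell
# Assumed version for illustration: the 24.02 release named in the PR subject.
# Pick your own YY.MM from the Triton release page / NGC registry tags.
export TRITON_CONTAINER_VERSION=24.02

# The patched README interpolates the variable into the build.py flags;
# echoing them shows the values build.py would receive:
echo "--upstream-container-version=${TRITON_CONTAINER_VERSION}"
echo "--backend=python:r${TRITON_CONTAINER_VERSION}"
echo "--backend=vllm:r${TRITON_CONTAINER_VERSION}"
```

Because the backend flags prepend `r` to form branch names (`r24.02`), the one variable keeps the container version and the backend branch checkouts in sync, which is the drift the original hard-coded `24.01` values were prone to.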