fix: Fix L0_implicit_state and its variants #7940
Conversation
* Modify "header_forward_pattern" to match headers case-insensitively. Add unit tests.
* fix indentation
* fix pre-commit errors
* Update doc
* Update copyright
* Add test case for "(?-i)", which disables regex case-insensitive mode.
* fix pre-commit
* Name each test. Remove support of disabling --http-header-forward-pattern case-insensitive mode on http python client.
* Update .md file.
* fix typo
* Reformat args.
* Fix pre-commit
* Fix test name issue.
* Fix pre-commit.
* Update md file and copyright.
* Update README and versions for 2.43.0 / 24.02
* Update Dockerfile to reduce image size.
* Update path in patch file for model generation
* Update README.md post-24.02
* patching git repository parameterization from production branch 1
* Fix go package directory name
* pre-commit fixes
* pre-commit fixes

Co-authored-by: kyle <[email protected]>

* Enhance bound check for shm offset
* Add test for enhanced bound check for shm offset
* Fix off-by-1 on max offset
* Improve comments
* Improve comment and offset
* Separate logic between computation and validation
* …6017) Allow non-decoupled model to send response and FINAL flag separately
* Update copyright
* Defer sending error until FINAL flag is seen to avoid invalid reference
* Move timestamp capture location
* Delay time-point of response-complete timestamp in GRPC and SageMaker endpoints
* Move location of RESPONSE_COMPLETE timestamp capture to better align with its meaning.
* Added a test case to check for optional/required input params in a request and the appropriate response from the server. Includes the addition of 3 simple models with a combination of required/optional input params.
* Add flag to enable compiling OpenAI support in PA
* Test Correlation Id string support for BLS
* Add AsyncIO HTTP compression test
* Improve command line option handling
* Update Dockerfile to install genai
* Change the installation script
* install both build and hatch
* Update name

Co-authored-by: Elias Bermudez <[email protected]>

* Added TRITONSERVER_InferenceTraceSetContext logic
* …odes (#6992) Add documentation for mapping between Triton Errors and HTTP status codes
* formatting
* Update README.md
* Update README and versions for 2.44.0 / 24.03 (#6971)
* Update README and versions for 2.44.0 / 24.03
* Mchornyi 24.03 (#6972): Current location is dropped in 12.4
* Update Dockerfile.win10.min
* Change to triton_sample_folder (#6973)
* Specify path for PyTorch model extension library (#7025)
* Update README.md 2.44.0 / 24.03 (#7032)
* Update README.md post-24.03

Co-authored-by: kyle <[email protected]>
Co-authored-by: Misha Chornyi <[email protected]>
Co-authored-by: Kyle McGill <[email protected]>

* Fix Otel version
* Fix version in CPU metrics
* Update metrics.md
* Update trace.md
* Add testing for iterative scheduler backlogged requests
* Update test count
* Remove conda from build
* Escape slash symbol
* Escape slash: align
* Escape slash: align
* Escape slash: align
* Escape slash: align
* Install virtualenv
* Fix vLLM flag
* remove conda flag
* Fix code style

Co-authored-by: Ryan McCormick <[email protected]>
Co-authored-by: Kyle McGill <[email protected]>
Co-authored-by: Suman Tatiraju <[email protected]>
Co-authored-by: Meenakshi Sharma <[email protected]>
Co-authored-by: Suman Tatiraju <[email protected]>

* Update max_sequence_idle_microseconds in the config
* Update max_sequence_idle_microseconds in the config
* Remove debug logs
* Update copyright
LGTM, just copyright nits, thanks for the fix!
Also, please expand on the bug and the fix in the description to educate future readers.
Ex: if a sequence times out mid-sequence, the server will interpret the next request as the start of a new sequence, but the client won't have set the START flag because it thinks it's still in the middle of a sequence, so an error will be raised -- but in nicer words.
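The failure mode the reviewer describes can be sketched in a few lines. This is a toy simulation of the sequence-idle timeout behavior, not Triton's actual implementation; all names (`SequenceBatcher`, `handle`) are illustrative.

```python
class SequenceBatcher:
    """Toy model of a sequence scheduler with an idle timeout."""

    def __init__(self, max_idle_us):
        self.max_idle_us = max_idle_us
        self.active = {}  # sequence_id -> timestamp of last request (us)

    def handle(self, seq_id, start_flag, now_us):
        last = self.active.get(seq_id)
        if last is not None and now_us - last > self.max_idle_us:
            # Idle timeout exceeded: the server silently closes the sequence.
            del self.active[seq_id]
            last = None
        if last is None and not start_flag:
            # Client believes the sequence is still open; server disagrees.
            raise RuntimeError(
                f"sequence {seq_id} is not active and request has no START flag")
        self.active[seq_id] = now_us
        return "ok"


batcher = SequenceBatcher(max_idle_us=1_000_000)  # 1-second idle limit
batcher.handle(42, start_flag=True, now_us=0)         # START: opens sequence
batcher.handle(42, start_flag=False, now_us=500_000)  # within idle limit: ok
try:
    # 2 seconds of silence: the sequence timed out server-side,
    # but the client still sends a mid-sequence request without START.
    batcher.handle(42, start_flag=False, now_us=2_500_000)
except RuntimeError as e:
    print("rejected:", e)
```

Raising the idle limit (as this PR does) keeps the sequence alive across the gaps between the test's requests, so the mid-sequence request is accepted.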
qa/L0_implicit_state/test.sh
Outdated
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright 2021-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2021-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Don't update copyright if file isn't modified
@@ -1,4 +1,4 @@
-# Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Copyright 2023-2025
Force-pushed from 0cd8a05 to a0d1b22
Please clean up the commit history.
The original change was so small, I'd probably just make a fresh branch for simplicity. I think that force push from Pavithra messed up the history.
Yup :)
What does the PR do?
Update the config file with max_sequence_idle_microseconds to avoid sequence timeouts in the server.

Checklist
<commit_type>: <Title>
Commit Type:
Check the conventional commit type box here and add the label to the github PR.
Related PRs:
NA
Where should the reviewer start?
Config file
Test plan:
NA
https://gitlab-master.nvidia.com/dl/dgx/tritonserver/-/pipelines/22507783
Caveats:
Background
The server waits for max_sequence_idle_microseconds before closing the request sequence, and expects the client to start a NEW sequence for ANY subsequent request.

The test was failing because max_sequence_idle_microseconds was set too low, causing the server to believe the sequence was over and to reject any further request with the SAME sequence_id that lacked the START flag, which made the test fail.
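The knob being changed lives in the model configuration's sequence_batching block. A minimal illustrative fragment is below; the value shown is an assumption for the sketch, not necessarily the one the PR actually uses.

```
sequence_batching {
  # Keep the sequence alive across gaps between the test's requests.
  # 5 seconds here is an illustrative value, not the PR's actual setting.
  max_sequence_idle_microseconds: 5000000
}
```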