I’m currently facing an issue with requests whose batch size is larger than the max_batch_size of the model hosted in Triton. The PyTriton chunking guide suggests this can be addressed, but I’m not sure how to implement it using the Triton client.
Related Open Issues
Installing pytriton also pulls in the Triton binaries, which I don’t need for client-side use. I found this issue where others have mentioned the lack of a lightweight pytriton.client package. Any updates on this?
There’s an ongoing discussion in Triton server issue #4547 about handling large requests, but there haven’t been updates there either.
Questions
How can I handle requests where the batch size exceeds the model’s max_batch_size? Specifically, I’d like to know how to split these large requests efficiently and send them to Triton in smaller batches.
Could you provide a minimal working example using TritonClient?
I’ve seen the PyTriton example, which includes asynchronous support, but I’m looking for something similar with TritonClient.
If possible, an example using concurrent.futures or async functionality would be very helpful (I’ve included a rough sketch of what I have in mind after these questions).
Is there a plan to release a standalone pytriton.client package to avoid installing the full pytriton? Alternatively, is there a plan to include this batch splitting logic in Triton server itself?
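To make the question concrete, here is the rough pattern I’ve been experimenting with on my side, using the standard tritonclient.http package rather than pytriton. The model name, input/output names, dtype, and max_batch_size below are placeholders for my setup, and I don’t know whether this is the recommended approach, which is exactly what I’m asking about:

```python
# Rough sketch: split an oversized batch into chunks of at most max_batch_size
# and send the chunks concurrently via the standard tritonclient HTTP client.
# MODEL_NAME, input/output names, dtype, and MAX_BATCH_SIZE are placeholders.
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import tritonclient.http as httpclient

URL = "localhost:8000"
MODEL_NAME = "my_model"   # placeholder
MAX_BATCH_SIZE = 8        # placeholder, must match the model's config


def infer_chunk(chunk: np.ndarray) -> np.ndarray:
    # One synchronous request per chunk; a separate client instance per call
    # since I'm not sure the HTTP client is safe to share across threads.
    client = httpclient.InferenceServerClient(url=URL)
    infer_input = httpclient.InferInput("INPUT__0", list(chunk.shape), "FP32")
    infer_input.set_data_from_numpy(chunk)
    result = client.infer(
        model_name=MODEL_NAME,
        inputs=[infer_input],
        outputs=[httpclient.InferRequestedOutput("OUTPUT__0")],
    )
    return result.as_numpy("OUTPUT__0")


def infer_large_batch(data: np.ndarray) -> np.ndarray:
    # Split along the batch dimension and fan the chunks out to a thread pool;
    # pool.map preserves order, so the outputs can simply be concatenated.
    chunks = [data[i:i + MAX_BATCH_SIZE] for i in range(0, len(data), MAX_BATCH_SIZE)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        outputs = list(pool.map(infer_chunk, chunks))
    return np.concatenate(outputs, axis=0)


if __name__ == "__main__":
    big_batch = np.random.rand(50, 3).astype(np.float32)  # 50 > MAX_BATCH_SIZE
    print(infer_large_batch(big_batch).shape)
```

Is something along these lines what the chunking guide has in mind, or is there a built-in way to do the splitting on the client side?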
Thanks in advance!