
/ready API for Kubernetes probe to know when TorchServe backend is ready to receive traffic #3047

Open
agunapal opened this issue Mar 27, 2024 · 1 comment

Comments

@agunapal (Collaborator)

🚀 The feature

This feature would add an API so that a Kubernetes readiness probe can be used to know when to start sending traffic to a TorchServe pod.

/ready will return 200 when all the models specified in config.properties have at least one backend worker ready to receive traffic.

This would make it simpler for customers to use TorchServe in a Kubernetes deployment with a multi-model-endpoint scenario.

Motivation, pitch

For the multi-model-endpoint use case with Kubernetes, consider a config.properties that defines the following models:

```properties
models={\
  "noop": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "noop.mar",\
        "minWorkers": 1,\
        "maxWorkers": 1,\
        "batchSize": 4,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  },\
  "vgg16": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "vgg16.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 8,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  }\
}
```

Today, one can use the /ping API to know when TorchServe is up, but this covers the frontend only. Backend workers for multiple models may take additional time to come up.

Alternatives

One can write a script that calls the /describe API for each model, tracks when each model has at least one backend worker, and then declares the pod ready.
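A minimal sketch of the decision logic such a script would need. It assumes the response shape of TorchServe's management API (`GET /models/{name}` on port 8081), which returns a JSON array of version entries, each carrying a `workers` list whose entries have a `status` field; verify this shape against your TorchServe version before relying on it.

```python
import json


def model_ready(describe_json: str) -> bool:
    """Return True if at least one backend worker of the model is READY.

    `describe_json` is the raw body returned by the management API's
    describe endpoint (assumed shape: a JSON array of model-version
    dicts, each with a "workers" list of {"id": ..., "status": ...}).
    """
    versions = json.loads(describe_json)
    return any(
        worker.get("status") == "READY"
        for version in versions
        for worker in version.get("workers", [])
    )
```

A pod-readiness wrapper would call this once per model named in config.properties (e.g. `noop` and `vgg16` above) and exit non-zero from a Kubernetes exec probe until every model reports True.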

Additional context

No response

@lxning (Collaborator) commented Mar 27, 2024

This PR is adding health APIs at both the model-server level and the model level.
