
/ready API for Kubernetes probe to know when TorchServe backend is ready to receive traffic #3047

Open
agunapal opened this issue Mar 27, 2024 · 1 comment

Comments

@agunapal (Collaborator)

🚀 The feature

This feature would add an API so that a Kubernetes readiness probe can be used to know when to start sending traffic to a TorchServe pod.

/ready will return 200 when all the models specified in config.properties have at least one backend worker ready to receive traffic.

This would make it simpler for customers to use TorchServe in a Kubernetes deployment with a multi-model-endpoint scenario.

Motivation, pitch

For the multi-model-endpoint use case with Kubernetes, consider a config.properties that defines the following models:

```properties
models={\
  "noop": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "noop.mar",\
        "minWorkers": 1,\
        "maxWorkers": 1,\
        "batchSize": 4,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  },\
  "vgg16": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "vgg16.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 8,\
        "maxBatchDelay": 100,\
        "responseTimeout": 120\
    }\
  }\
}
```

Today, one can use the /ping API to know when TorchServe is up, but this covers the frontend only. Backend workers for multiple models may take additional time to come up.

Alternatives

One can write a script that calls the /describe API for each model, tracks when each model has at least one backend worker, and then declares the pod ready.
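A minimal sketch of the decision logic such a script would need. It assumes the response shape of TorchServe's management API (`GET /models/{name}` on port 8081), which returns a JSON array of version entries, each carrying a `workers` list whose entries have a `status` field; verify this shape against your TorchServe version before relying on it.

```python
import json


def model_ready(describe_json: str) -> bool:
    """Return True if at least one backend worker of the model is READY.

    `describe_json` is the raw body returned by the management API's
    describe endpoint (assumed shape: a JSON array of model-version
    dicts, each with a "workers" list of {"id": ..., "status": ...}).
    """
    versions = json.loads(describe_json)
    return any(
        worker.get("status") == "READY"
        for version in versions
        for worker in version.get("workers", [])
    )
```

A pod-readiness wrapper would call this once per model named in config.properties (e.g. `noop` and `vgg16` above) and exit non-zero from a Kubernetes exec probe until every model reports True.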

Additional context

No response

@lxning (Collaborator) commented Mar 27, 2024

This PR is adding health APIs at both the model-server level and the model level.
