
API details #560

Closed
1 task done
LiangA opened this issue Feb 5, 2024 · 6 comments
Assignees
Labels
question Further information is requested

Comments


LiangA commented Feb 5, 2024

Search before asking

Question

I would like to create an inference API, with accessibility, reliability, security, and, of course, cost in mind. But I can't find any detailed information beyond the 10,000 calls/month limit in the HUB Pro plan.

I'm also curious about what tools are available. (I'm an AWS user, so terms like S3, EC2, IAM, or VPC will help me understand.)
If possible, I would also like to know about the service level.

Thanks!

Additional

No response

LiangA added the question label Feb 5, 2024

github-actions bot commented Feb 5, 2024

👋 Hello @LiangA, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more:

  • Quickstart. Start training and deploying YOLO models with HUB in seconds.
  • Datasets: Preparing and Uploading. Learn how to prepare and upload your datasets to HUB in YOLO format.
  • Projects: Creating and Managing. Group your models into projects for improved organization.
  • Models: Training and Exporting. Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
  • Integrations. Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
  • Ultralytics HUB App. Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
    • iOS. Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
    • Android. Explore TFLite acceleration on mobile devices.
  • Inference API. Understand how to use the Inference API for running your trained models in the cloud to generate predictions.

If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

kalenmike (Contributor) commented Feb 5, 2024
@LiangA I am not sure I understand your question: are you asking about implementing your own custom API or using the API provided by HUB?

We currently provide a shared inference option in both the free and pro plans with defined rate limits. They do not include any additional customization.

We have plans to introduce a dedicated inference option that will allow more customization.

kalenmike self-assigned this Feb 5, 2024
LiangA (Author) commented Feb 5, 2024

Hi kalenmike,
Thanks for the quick reply!
Let me describe my need: I'm looking for a way to upload a trained model and expose APIs for my own use. So it would be more like implementing my own custom API.

It looks like HUB doesn't provide this service right now; am I understanding you correctly?

kalenmike (Contributor) commented Feb 5, 2024

@LiangA You have access to the shared HUB API at the moment. All of your models have an endpoint that will return inference results. It's important to understand that because these endpoints are shared, they are also stateless, so they are more of a solution for previewing and testing than for production. The shared inference API has higher latency because your model needs to be fetched and loaded on each request, which results in slower responses.

The dedicated inference API, which is still in the development pipeline, deploys a scalable API that is preloaded with your model. It has lower latency because your model is ready to go on each request.

We are finishing up our Cloud Training feature, and it is being prioritized over the dedicated inference API, so that development work is backlogged for the moment.

These are all cloud solutions where Ultralytics handles the hardware and software implementations for you. You can, of course, implement all of this yourself using the ultralytics Python package and your own hardware/software stack.
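The self-hosted route mentioned above can be sketched as a minimal HTTP inference endpoint. This is a standard-library-only illustration: `run_inference` is a stub, and in a real deployment you would load your trained weights with the ultralytics package (e.g. `YOLO("best.pt")`) and call the model inside it — that is the part HUB would otherwise handle for you.

```python
# Minimal sketch of a self-hosted inference endpoint (stdlib only).
# run_inference() is a placeholder; a real service would load a model
# once at startup, e.g. with the ultralytics package:
#   from ultralytics import YOLO
#   model = YOLO("path/to/best.pt")
# and run the model on the decoded image inside run_inference().
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_inference(image_bytes: bytes) -> dict:
    """Placeholder for a real model call (e.g. ultralytics YOLO)."""
    return {"predictions": [], "image_size_bytes": len(image_bytes)}


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw request body (the image bytes) and run inference.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        payload = json.dumps(run_inference(body)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


# To serve:
# HTTPServer(("0.0.0.0", 8000), InferenceHandler).serve_forever()
```

Scaling, auth, and GPU provisioning are then on you, which is the trade-off against the managed options described above.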

You can demo the shared API using a free HUB account by following these steps:

  1. Train a HUB model
  2. Open the deploy tab for your trained model https://hub.ultralytics.com/models/<model_id>?tab=deploy
  3. Copy your endpoint and view the implementation example if needed.
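As a sketch of step 3, the copied endpoint can be called with a multipart image upload. The header name (`x-api-key`) and form field (`image`) below are assumed placeholders, not a confirmed contract; copy the real values from the implementation example shown in your model's Deploy tab.

```python
# Hypothetical sketch of calling the shared inference endpoint with a
# multipart/form-data POST, using only the standard library. The
# "x-api-key" header and "image" field names are placeholders; use the
# implementation example from the Deploy tab for the real values.
import urllib.request
import uuid


def build_inference_request(endpoint: str, api_key: str,
                            image_bytes: bytes) -> urllib.request.Request:
    """Build a multipart POST request uploading one image."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="image"; filename="image.jpg"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + image_bytes + f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "x-api-key": api_key,  # placeholder header name
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


# Sending it requires a real endpoint and key:
# with urllib.request.urlopen(build_inference_request(url, key, data)) as r:
#     print(r.read())
```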

Free accounts have the following rate limits applied:
1/second, 100/hour, 1000/month
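A client-side throttle matching the quoted per-second and per-hour limits can be sketched with sliding windows (the monthly cap is better tracked against your account server-side). This is an illustration of respecting the limits, not part of HUB itself.

```python
# Sliding-window client-side throttle for stacked rate limits,
# e.g. the free plan's 1/second and 100/hour. Stdlib only.
import time
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, limits):
        # limits: iterable of (max_calls, window_seconds) pairs.
        self.limits = [(n, w, deque()) for n, w in limits]

    def acquire(self, now=None) -> bool:
        """Return True and record the call if every window has room."""
        now = time.monotonic() if now is None else now
        for n, window, calls in self.limits:
            # Drop timestamps that have aged out of this window.
            while calls and now - calls[0] >= window:
                calls.popleft()
            if len(calls) >= n:
                return False  # this window is full; caller should wait
        for _, _, calls in self.limits:
            calls.append(now)
        return True


# Free-plan style limits: 1 call/second and 100 calls/hour.
limiter = SlidingWindowLimiter([(1, 1.0), (100, 3600.0)])
```

Before each request, call `limiter.acquire()` and back off (or queue) when it returns False instead of letting the API reject the call.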

LiangA (Author) commented Feb 5, 2024

Thanks for making this crystal clear. I'll keep an eye on the updates.

UltralyticsAssistant (Member) commented

You're welcome, @LiangA! If you have any more questions or need further assistance in the future, don't hesitate to reach out. Happy to help, and we appreciate your interest in Ultralytics HUB! Keep an eye on our updates for new features. 😊 Have a great day!
