Piotrm api server #30
…ence-server/triton-distributed into nnshah1-hello-world
app = FastAPI()
create_app(connector, app)
This logic will need to change a little, as the T2 OpenAI Frontend defines and wraps a FastAPI app in the FastApiFrontend object. It uses a similar helper under the hood, but this encapsulates the definitions and schemas as well.
def start_server(self):
    """
    Launch uvicorn in a background thread or so
    """
    config = uvicorn.Config(self.app, host="0.0.0.0", port=8080, log_level="info")
    self.server = uvicorn.Server(config)
    self._logger.info("Starting uvicorn server for openai endpoints.")
    self.server.run()
This should do something like this instead to use the unified OpenAI semantics - see related comment here: https://github.com/triton-inference-server/triton-distributed/pull/30/files#r1920609943
frontend = FastApiFrontend()
frontend.start()
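Separately from the choice of frontend: the docstring in start_server mentions a background thread, but self.server.run() blocks the calling thread. A minimal sketch of the background-thread pattern is below. It assumes only that the server object has a blocking run() method and a should_exit flag, which is the shape of uvicorn.Server. BackgroundServer and FakeServer are hypothetical names used for illustration (FakeServer stands in for uvicorn.Server so the sketch runs without uvicorn); neither is part of this PR or of triton-distributed.

```python
import threading
import time


class BackgroundServer:
    """Run a blocking server object in a daemon thread (hypothetical helper).

    Assumes `server` exposes a blocking run() method and a should_exit
    attribute, as uvicorn.Server does.
    """

    def __init__(self, server):
        self._server = server
        self._thread = threading.Thread(target=server.run, daemon=True)

    def start(self):
        # Returns immediately; the server loop runs in the background thread.
        self._thread.start()

    def stop(self, timeout=5.0):
        # Ask the server loop to exit, then wait for the thread to finish.
        self._server.should_exit = True
        self._thread.join(timeout)
        return not self._thread.is_alive()


class FakeServer:
    """Stand-in for uvicorn.Server so the sketch runs without uvicorn."""

    def __init__(self):
        self.should_exit = False
        self.started = False

    def run(self):
        self.started = True
        # Block until someone sets should_exit, like a real serving loop.
        while not self.should_exit:
            time.sleep(0.01)
```

With a real server, one would pass uvicorn.Server(config) instead of FakeServer(). The thread is a daemon so that it does not keep the process alive if stop() is never called.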
You can run the deployment for the API server:
The standard output log doesn't indicate much:
The example produces many logs in a subfolder related to the location of the source code:
The logs again don't show much except:
The operator binding code:
# define all your worker configs as before: encoder, decoder, etc.
api_server_op = OperatorConfig(
    name="api_server",
    implementation="ApiServerOperator",  # matches the .py file's operator class
    max_inflight_requests=1,
)
api_server = WorkerConfig(operators=[api_server_op], name="api_server")
deployment = Deployment(
    [
        (api_server, 1),
    ],
    initialize_request_plane=True,
    log_dir=args.log_dir,
    log_level=args.log_level,
)
What is wrong here?
You need to modify the PYTHONPATH to point to the correct directories:
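The exact directories are not shown in this thread, so the following is only a sketch of the general mechanism: prepend extra directories to PYTHONPATH in the environment handed to a child process. The helper name env_with_pythonpath and any directory names passed to it are hypothetical, not paths from this repository.

```python
import os


def env_with_pythonpath(extra_dirs, base_env=None):
    """Return a copy of the environment with extra_dirs prepended to PYTHONPATH.

    Hypothetical helper: extra_dirs are placeholders for whatever
    directories contain the operator modules.
    """
    env = dict(os.environ if base_env is None else base_env)
    parts = [os.path.abspath(d) for d in extra_dirs]
    existing = env.get("PYTHONPATH")
    if existing:
        # Keep whatever was already configured, after the new entries.
        parts.append(existing)
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when launching the deployment script, which leaves the parent shell's PYTHONPATH untouched.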
Failure:
A different approach was implemented in PR #46.