Extend endpointOai.ts to allow usage of extra sampling parameters #1032
Conversation
Extend endpointOai.ts to allow usage of extra sampling parameters when calling vllm as an OpenAI compatible
@nsarrazin I apologize for forgetting the lint check. I've run Prettier and confirmed that the lint check passes locally. Please review. :)
Hey! Thanks for the contrib, sorry for the review delay, I finally took a look at it. 😅

I looked at the python client you linked to, so I thought we could do the same in chat-ui by moving the extra parameters into the endpoint config. I pushed a commit that does this and simplifies the code a bit too.

I'm not super familiar with the OpenAI client; is it fine to pass the extended body under the options like this? Let me know if this commit works for you, otherwise we can change things 😄
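For context, a minimal sketch of what "passing the extended body under the options" might look like with the openai Node client. The model name, extra parameters, and merge shape here are assumptions based on this discussion, not the exact committed code:

```ts
import OpenAI from "openai";

const openai = new OpenAI({ baseURL: "http://localhost:8000/v1", apiKey: "none" });

// Extra sampling parameters from the endpoint config (hypothetical values).
const extraBody = { top_k: 50 };

const body = {
  model: "vllm",
  messages: [{ role: "user" as const, content: "Hello!" }],
  stream: true as const,
};

// The second argument is the client's RequestOptions; supplying `body` there
// overrides the serialized request body, so the extras reach the server
// without widening the typed ChatCompletionCreateParams interface.
const stream = await openai.chat.completions.create(body, {
  body: { ...body, ...extraBody },
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```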
Thank you for your response. Firstly, regarding the OpenAI node library, the data type of the body is defined through interfaces such as `ChatCompletionCreateParamsStreaming`.

Reference: GitHub - OpenAI Node Library

```ts
create(
  body: ChatCompletionCreateParamsStreaming,
  options?: Core.RequestOptions,
): APIPromise<Stream<ChatCompletionChunk>>;
```

These interfaces define the permissible hyperparameters. Thus, parameters not permitted by the interface, such as `top_k`, cannot be passed through the typed body.

Reference: GitHub - OpenAI Node Library

Conclusively, the current simplified code will result in additional parameters (e.g. `top_k`) being dropped from the request. This approach resulted in somewhat complex code. I wished to simplify it further, but for the reasons mentioned above, it was challenging to make it simpler. I am curious if it is possible to merge using the code before the simplification.
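To illustrate the constraint being described, a sketch assuming the openai Node client's typed interface (reusing the `openai` client from the sketch above; `top_k` stands in for any non-standard parameter):

```ts
// TypeScript rejects this call without the suppression below, because
// `top_k` is not a property of ChatCompletionCreateParamsStreaming.
await openai.chat.completions.create({
  model: "vllm",
  messages: [{ role: "user", content: "Hi" }],
  stream: true,
  // @ts-expect-error -- top_k is not part of the official OpenAI API types
  top_k: 50,
});
```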
Hi, trying to understand the difference between the two versions here. You used to merge the extra parameters into the body yourself; in my commit I switched to passing the full body + `extraBody` inside of the options, which should be equivalent? 👀 Maybe I missed something? Let me know if that's the case and I can fix/revert.
Ah, you are indeed correct. I misunderstood your code initially; it's exactly the simplification I was looking for. You are a genius. I apologize for the confusion I caused.

I have tested the code you committed and made one additional amendment to it. In vllm, for example, the extraBody can contain various data types (reference: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-completions-api). The previous extraBody could only accept strings, which I found somewhat restrictive, so I think it is okay to relax it to any type. Looking forward to your response. Thank you.
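The amendment described here is the zod schema relaxation from the commit list below ("Update zod schema to allow any type in extraBody"). The exact field names are assumptions, but the shape of the change would be something like:

```ts
import { z } from "zod";

// Before (hypothetical): only string values were allowed in extraBody.
const extraBodyBefore = z.record(z.string()).optional();

// After: any JSON value is allowed, so vLLM extras with mixed types validate,
// e.g. { "top_k": 50, "use_beam_search": true, "stop": ["\n"] }.
const extraBodyAfter = z.record(z.any()).optional();
```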
No worries! 😊 Looks good to me, thanks for contributing!
How do I send the extra parameters using extraBody? For example, when calling `for await (const output of await endpoint({ ... }))`. Any other ways?
I've configured the additional parameters for the API endpoints similarly to the setup outlined in the README.md, utilizing the `MODELS` environment variable. For instance, if we need to add `top_k`:

```env
MODELS=[
  {
    "name": "vllm",
    "id": "vllm",
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "max_new_tokens": 1024
    },
    "endpoints": [{
      "type": "openai",
      "baseURL": "http://localhost:8000/v1",
      "extraBody": {"top_k": 50}
    }]
  }
]
```

This setup ensures that the API can receive and handle these parameters effectively, enhancing the model's functionality as specified.
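With a config like this, the extras are injected by the endpoint itself, so the call site from the question above stays unchanged. A hypothetical sketch, where chat-ui's internal endpoint signature and streamed output shape are assumed:

```ts
// `endpoint` is assumed to be constructed from the "vllm" model config above;
// top_k: 50 is added by the endpoint, not by this caller.
for await (const output of await endpoint({
  messages: [{ from: "user", content: "Hello!" }],
})) {
  process.stdout.write(output.token?.text ?? "");
}
```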
I see. Thank you for your explanation.
Update README.md: Fixed incorrect setup for extra parameters in OpenAI compatible server configuration (see PR #1032)
Extend endpointOai.ts to allow usage of extra sampling parameters (huggingface#1032)

* Extend endpointOai.ts to allow usage of extra sampling parameters when calling vllm as an OpenAI compatible
* refactor: prettier endpointOai.ts
* Fix: Corrected type imports in endpointOai.ts
* Simplifies code a bit and adds `extraBody` to OpenAI endpoint
* Update zod schema to allow any type in extraBody

---------

Co-authored-by: Nathan Sarrazin <[email protected]>
Update README.md (huggingface#1141): Fixed incorrect setup for extra parameters in OpenAI compatible server configuration (see PR huggingface#1032)
Description
This pull request introduces the ability to utilize the full range of sampling parameters when using an OpenAI-compatible API (e.g. vLLM). OpenAI's API currently supports a limited set of sampling parameters, which restricts the full capabilities of open LLM models. By adding additional parameters such as `best_of` and `top_k` to the request body, we can leverage an open LLM's enhanced inference performance.

For instance, vLLM supports several advanced sampling parameters that can be utilized to enhance model performance. Detailed descriptions and usage instructions for these parameters can be found in the official documentation: vLLM Extra Parameters.
Code Changes
The modifications include an update to the use of the OpenAI JavaScript library, where the model configuration now accepts an `extra_body` object. This object is then merged with the standard body parameters during the API call. The reason for naming it `extra_body` is that this variable name is used in Python's OpenAI library.
Below is an example of how to configure these parameters in the model:
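A sketch mirroring the endpoint configuration discussed in the thread above; the `top_k` value is illustrative:

```env
MODELS=[
  {
    "name": "vllm",
    "id": "vllm",
    "endpoints": [{
      "type": "openai",
      "baseURL": "http://localhost:8000/v1",
      "extraBody": {"top_k": 50}
    }]
  }
]
```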
Impact
This enhancement allows users of the OpenAI-compatible server to fully exploit the model's capabilities, improving the quality and flexibility of generated responses.