Extend endpointOai.ts to allow usage of extra sampling parameters #1032
Conversation
Extend endpointOai.ts to allow usage of extra sampling parameters when calling vllm as an OpenAI compatible
@nsarrazin I apologize for forgetting the lint check. I've run Prettier and confirmed that the lint check passes locally. Please review. :)
Hey! Thanks for the contrib, sorry for the review delay, I finally took a look at it. 😅

I looked at the python client you linked to, so I thought we could do the same in chat-ui by moving the extra parameters into the endpoint config. I pushed a commit that does this and simplifies the code a bit too.

I'm not super familiar with the OpenAI client; is it fine to pass the extended body under the options like this? Let me know if this commit works for you, otherwise we can change things 😄
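For context, a minimal sketch of what "passing the extended body under the options" might look like with the openai Node client. The model name, extra parameters, and merge shape here are assumptions based on this discussion, not the exact committed code:

```ts
import OpenAI from "openai";

const openai = new OpenAI({ baseURL: "http://localhost:8000/v1", apiKey: "none" });

// Extra sampling parameters from the endpoint config (hypothetical values).
const extraBody = { top_k: 50 };

const body = {
  model: "vllm",
  messages: [{ role: "user" as const, content: "Hello!" }],
  stream: true as const,
};

// The second argument is the client's RequestOptions; supplying `body` there
// overrides the serialized request body, so the extras reach the server
// without widening the typed ChatCompletionCreateParams interface.
const stream = await openai.chat.completions.create(body, {
  body: { ...body, ...extraBody },
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```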
Thank you for your response. Firstly, regarding the OpenAI node library, the data type of the body is defined through interfaces such as `ChatCompletionCreateParamsStreaming`.

Reference: GitHub - OpenAI Node Library

```ts
create(
  body: ChatCompletionCreateParamsStreaming,
  options?: Core.RequestOptions,
): APIPromise<Stream<ChatCompletionChunk>>;
```

These interfaces define the permissible hyperparameters. Thus, parameters not permitted by the interface, such as `top_k`, cannot be passed through the typed body.

Reference: GitHub - OpenAI Node Library

Conclusively, the current simplified code will result in additional parameters (e.g. `top_k`) being dropped from the request. This approach resulted in somewhat complex code. I wished to simplify it further, but for the reasons mentioned above, it was challenging to make it simpler. I am curious if it is possible to merge using the code before the simplification.
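To illustrate the constraint being described, a sketch assuming the openai Node client's typed interface (reusing the `openai` client from the sketch above; `top_k` stands in for any non-standard parameter):

```ts
// TypeScript rejects this call without the suppression below, because
// `top_k` is not a property of ChatCompletionCreateParamsStreaming.
await openai.chat.completions.create({
  model: "vllm",
  messages: [{ role: "user", content: "Hi" }],
  stream: true,
  // @ts-expect-error -- top_k is not part of the official OpenAI API types
  top_k: 50,
});
```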
Hi, trying to understand the difference between the two versions here. You used to merge the extra parameters into the body yourself; in my commit I switched to passing the full body + `extraBody` inside of the options, which should be equivalent? 👀 Maybe I missed something? Let me know if that's the case and I can fix/revert.
Ah, you are indeed correct. I misunderstood your code initially; it's exactly the simplification I was looking for. You are a genius. I apologize for the confusion I caused.

I have tested the code you committed and made one additional amendment to it. In vllm, for example, the extraBody can contain various data types (reference: https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters-for-completions-api). The previous extraBody could only accept strings, which I found somewhat restrictive, so I think it is okay to relax it to any type. Looking forward to your response. Thank you.
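The amendment described here is the zod schema relaxation from the commit list below ("Update zod schema to allow any type in extraBody"). The exact field names are assumptions, but the shape of the change would be something like:

```ts
import { z } from "zod";

// Before (hypothetical): only string values were allowed in extraBody.
const extraBodyBefore = z.record(z.string()).optional();

// After: any JSON value is allowed, so vLLM extras with mixed types validate,
// e.g. { "top_k": 50, "use_beam_search": true, "stop": ["\n"] }.
const extraBodyAfter = z.record(z.any()).optional();
```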
No worries! 😊 Looks good to me, thanks for contributing!
How do I send the extra parameters using extraBody? For example, when calling `for await (const output of await endpoint({ ... }))`. Any other ways?
I've configured the additional parameters for the API endpoints similarly to the setup outlined in the README.md, utilizing the `MODELS` environment variable. For instance, if we need to add `top_k`:

```env
MODELS=[
  {
    "name": "vllm",
    "id": "vllm",
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "max_new_tokens": 1024
    },
    "endpoints": [{
      "type": "openai",
      "baseURL": "http://localhost:8000/v1",
      "extraBody": {"top_k": 50}
    }]
  }
]
```

This setup ensures that the API can receive and handle these parameters effectively, enhancing the model's functionality as specified.
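With a config like this, the extras are injected by the endpoint itself, so the call site from the question above stays unchanged. A hypothetical sketch, where chat-ui's internal endpoint signature and streamed output shape are assumed:

```ts
// `endpoint` is assumed to be constructed from the "vllm" model config above;
// top_k: 50 is added by the endpoint, not by this caller.
for await (const output of await endpoint({
  messages: [{ from: "user", content: "Hello!" }],
})) {
  process.stdout.write(output.token?.text ?? "");
}
```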
I see. Thank you for your explanation.
Update README.md: Fixed incorrect setup for extra parameters in OpenAI compatible server configuration (see PR #1032)
Extend endpointOai.ts to allow usage of extra sampling parameters (huggingface#1032)

* Extend endpointOai.ts to allow usage of extra sampling parameters when calling vllm as an OpenAI compatible
* refactor: prettier endpointOai.ts
* Fix: Corrected type imports in endpointOai.ts
* Simplifies code a bit and adds `extraBody` to OpenAI endpoint
* Update zod schema to allow any type in extraBody

---------

Co-authored-by: Nathan Sarrazin <[email protected]>
Update README.md (huggingface#1141): Fixed incorrect setup for extra parameters in OpenAI compatible server configuration (see PR huggingface#1032)
Description
This pull request introduces the ability to utilize the full range of sampling parameters when using an OpenAI-compatible API (e.g. vLLM). OpenAI's API currently supports a limited set of sampling parameters, which restricts the full capabilities of open LLM models. By adding additional parameters such as `best_of` and `top_k` to the request body, we can leverage an open LLM's enhanced inference performance.

For instance, vLLM supports several advanced sampling parameters that can be utilized to enhance model performance. Detailed descriptions and usage instructions for these parameters can be found in the official documentation: vLLM Extra Parameters.
Code Changes
The modifications include an update to the use of the OpenAI JavaScript library, where the model configuration now accepts an `extra_body` object. This object is then merged with the standard body parameters during the API call. The reason for naming it `extra_body` is that this variable name is used in Python's OpenAI library.
Below is an example of how to configure these parameters in the model:
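A sketch mirroring the endpoint configuration discussed in the thread above; the `top_k` value is illustrative:

```env
MODELS=[
  {
    "name": "vllm",
    "id": "vllm",
    "endpoints": [{
      "type": "openai",
      "baseURL": "http://localhost:8000/v1",
      "extraBody": {"top_k": 50}
    }]
  }
]
```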
Impact
This enhancement allows users of the OpenAI-compatible server to fully exploit the model's capabilities, improving the quality and flexibility of generated responses.