
[docs] Route your openai compatible sdk base_url to the Guardrails Server #1212

Open · joy1007 opened this issue Jan 14, 2025 · 2 comments
Labels: documentation (Improvements or additions to documentation)

joy1007 commented Jan 14, 2025

Description
The documentation "Quickstart: Guardrails Server" mentions updating the client to use the Guardrails Server.
I wonder how exactly I can route my OpenAI (or OpenAI-compatible SDK) base_url to the http://localhost:8000/guards/[guard_name]/openai/v1/ endpoint.

Current documentation
[Screenshot of the current documentation: "Screenshot 2025-01-14 091134"]

Additional context
How can I route requests to my own model through the Guardrails Server, given that the OpenAI client's base_url is already taken by the Guardrails Server endpoint?
For example, assume my OpenAI-compatible endpoint is http://my-openai.endpoint.com/v1
and the client's base_url has already been set as below:
client = OpenAI(
    base_url="http://localhost:8000/guards/gibberish_guard/openai/v1",
    api_key="token_abc"
)
When I want to use my model, passing my model name like this doesn't work:
response = client.chat.completions.create(
    model="my-model-name",
    messages=[{
        "role": "user",
        "content": "Make up some gibberish for me please!"
    }]
)

joy1007 added the documentation (Improvements or additions to documentation) label on Jan 14, 2025
dtam (Contributor) commented Jan 14, 2025

@joy1007 Can I get some more information here? What's the error you're getting? As far as I can tell, this should be supported. Does the non-OpenAI endpoint respond? It should be a POST to http://localhost:8000/guards/gibberish_guard/validate. What does your config.py look like? What happens if you curl it or try to use another model?
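
For reference, a minimal sketch of exercising that validate endpoint from Python, assuming the guardrails-api server accepts a JSON body with an llmOutput field (that field name, and the absence of any auth header, are assumptions rather than something confirmed in this thread):

import httpx

# Hypothetical check of the guard's /validate endpoint; adjust the
# payload shape to whatever your running guardrails-api server expects.
resp = httpx.post(
    "http://localhost:8000/guards/gibberish_guard/validate",
    json={"llmOutput": "Some text for the guard to validate."},
)
print(resp.status_code)
print(resp.json())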

joy1007 (Author) commented Jan 15, 2025

Thank you for your response!

We deployed a LLaMA model using vLLM and configured it to be reachable via its OpenAI-compatible endpoint. Typically, we make requests as follows:

from openai import OpenAI
import httpx

client = OpenAI(
    base_url="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    http_client=httpx.Client(verify=False)  # Ignore SSL verification
)

However, since we wanted to use the Guardrails server, we modified the script as follows (the model data is anonymized for privacy):

from openai import OpenAI
import httpx

client = OpenAI(
    base_url="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    http_client=httpx.Client(verify=False)
)
 
guardrails_client = OpenAI(
    base_url="http://localhost:8000/guards/detect_pii/openai/v1",
    api_key="token_abc"
)
 
guardrails_response = guardrails_client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=[{
        "role": "user", 
        "content": "Hello, who are you?"
    }]
)
 
print(guardrails_response.choices[0].message.content)
print(guardrails_response.guardrails['validation_passed'])

Here is the error we encountered when running the above Python script:

Traceback (most recent call last):
  File "/mnt/d/guard_server.py", line 16, in <module>
    guardrails_response = guardrails_client.chat.completions.create(
...
openai.InternalServerError: Error code: 500 - {'detail': 'Internal Server Error'}

Here is the config.py file we used:

from guardrails import Guard
from guardrails.hub import DetectPII
guard = Guard()
guard.name = 'detect_pii'
print("GUARD PARAMETERS UNFILLED! UPDATE THIS FILE!")  # TODO: Remove this when parameters are filled.
guard.use(DetectPII())  # TODO: Add parameters.
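
For comparison, a filled-in config.py might look something like the sketch below; the specific PII entities and the on_fail policy are illustrative assumptions, not values taken from this setup:

from guardrails import Guard
from guardrails.hub import DetectPII

# Hypothetical filled-in version; the entity list and on_fail policy are assumptions.
guard = Guard()
guard.name = 'detect_pii'
guard.use(DetectPII(pii_entities=["EMAIL_ADDRESS", "PHONE_NUMBER"], on_fail="fix"))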

Additionally, the server-side error log reports:

ERROR:guardrails-api:The callable `fn` passed to `Guard(fn, ...)` failed with the following error: `litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=llama-3.1-8b-instruct
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers`. Make sure that `fn` can be called as a function
that accepts a prompt string, **kwargs, and returns a string.
If you're using a custom LLM callable, please see docs
here: https://go.guardrailsai.com/B1igEy3
...
litellm.exceptions.BadRequestError: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=llama-3.1-8b-instruct
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers
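
For context, LiteLLM infers the provider from a prefix on the model name (as the error's huggingface/starcoder example shows). A minimal sketch of calling an OpenAI-compatible endpoint through LiteLLM directly might look like this; the openai/ prefix and the api_base/api_key values are assumptions for illustration, not a confirmed fix:

from litellm import completion

# Illustrative direct LiteLLM call against an OpenAI-compatible endpoint;
# the provider prefix and endpoint values are assumptions.
response = completion(
    model="openai/llama-3.1-8b-instruct",
    api_base="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(response.choices[0].message.content)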

We would appreciate guidance on how to resolve this issue. Specifically:

  • How can we configure Guardrails Server to recognize and correctly interact with our custom LLaMA model endpoint?
  • What changes are necessary to properly pass the model and provider information?
  • Is there an example for setting up a custom LLM callable in this context?
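
Going by the requirement quoted in the server log (a callable that accepts a prompt string plus **kwargs and returns a string), a rough sketch of such a callable is below; how it gets registered with the server-side guard is not shown, and the endpoint values are just the anonymized ones from above:

from openai import OpenAI
import httpx

vllm_client = OpenAI(
    base_url="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    http_client=httpx.Client(verify=False),
)

# Hypothetical custom LLM callable: takes a prompt string and **kwargs,
# forwards the prompt to the vLLM endpoint, and returns the reply as a string.
def my_llm(prompt: str, **kwargs) -> str:
    result = vllm_client.chat.completions.create(
        model="llama-3.1-8b-instruct",
        messages=[{"role": "user", "content": prompt}],
    )
    return result.choices[0].message.content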

Thank you!
