
[docs] Route your openai compatible sdk base_url to the Guardrails Server #1212

Open · joy1007 opened this issue Jan 14, 2025 · 2 comments
Labels: documentation (Improvements or additions to documentation)

joy1007 commented Jan 14, 2025

Description
The documentation "Quickstart: Guardrails Server" mentions updating the client to use the Guardrails Server.
I wonder how exactly I can route my OpenAI (or OpenAI-compatible SDK) base_url to the http://localhost:8000/guards/[guard_name]/openai/v1/ endpoint.

Current documentation
[Screenshot of the current documentation: "Screenshot 2025-01-14 091134"]

Additional context
How can I route requests to my own model through the Guardrails Server, given that the OpenAI client's base_url is already taken by the Guardrails Server endpoint?
For example, assume my OpenAI-compatible endpoint is http://my-openai.endpoint.com/v1
and the client's base_url has already been set as below:
client = OpenAI(
    base_url="http://localhost:8000/guards/gibberish_guard/openai/v1",
    api_key="token_abc"
)
When I want to use my model, passing my model name like this doesn't work:
response = client.chat.completions.create(
    model="my-model-name",
    messages=[{
        "role": "user",
        "content": "Make up some gibberish for me please!"
    }]
)

joy1007 added the documentation (Improvements or additions to documentation) label on Jan 14, 2025
dtam (Contributor) commented Jan 14, 2025

@joy1007 Can I get some more information here? What's the error you're getting? As far as I can tell, this should be supported. Does the non-OpenAI endpoint respond? It should be a POST to http://localhost:8000/guards/gibberish_guard/validate. What does your config.py look like? What happens if you curl it or try to use another model?
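
For reference, a minimal sketch of exercising that validate endpoint from Python, assuming the guardrails-api server accepts a JSON body with an llmOutput field (that field name, and the absence of any auth header, are assumptions rather than something confirmed in this thread):

import httpx

# Hypothetical check of the guard's /validate endpoint; adjust the
# payload shape to whatever your running guardrails-api server expects.
resp = httpx.post(
    "http://localhost:8000/guards/gibberish_guard/validate",
    json={"llmOutput": "Some text for the guard to validate."},
)
print(resp.status_code)
print(resp.json())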

joy1007 (Author) commented Jan 15, 2025

Thank you for your response!

We deployed a LLaMA model using vLLM and configured it to be reachable via its OpenAI-compatible endpoint. Typically, we make requests as follows:

from openai import OpenAI
import httpx

client = OpenAI(
    base_url="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    http_client=httpx.Client(verify=False)  # Ignore SSL verification
)

However, since we wanted to use the Guardrails server, we modified the script as follows (the model data is anonymized for privacy):

from openai import OpenAI
import httpx

client = OpenAI(
    base_url="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    http_client=httpx.Client(verify=False)
)
 
guardrails_client = OpenAI(
    base_url="http://localhost:8000/guards/detect_pii/openai/v1",
    api_key="token_abc"
)
 
guardrails_response = guardrails_client.chat.completions.create(
    model="llama-3.1-8b-instruct",
    messages=[{
        "role": "user", 
        "content": "Hello, who are you?"
    }]
)
 
print(guardrails_response.choices[0].message.content)
print(guardrails_response.guardrails['validation_passed'])

Here is the error we encountered when running the above Python script:

Traceback (most recent call last):
  File "/mnt/d/guard_server.py", line 16, in <module>
    guardrails_response = guardrails_client.chat.completions.create(
...
openai.InternalServerError: Error code: 500 - {'detail': 'Internal Server Error'}

Here is the config.py file we used:

from guardrails import Guard
from guardrails.hub import DetectPII
guard = Guard()
guard.name = 'detect_pii'
print("GUARD PARAMETERS UNFILLED! UPDATE THIS FILE!")  # TODO: Remove this when parameters are filled.
guard.use(DetectPII())  # TODO: Add parameters.
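
For comparison, a filled-in config.py might look something like the sketch below; the specific PII entities and the on_fail policy are illustrative assumptions, not values taken from this setup:

from guardrails import Guard
from guardrails.hub import DetectPII

# Hypothetical filled-in version; the entity list and on_fail policy are assumptions.
guard = Guard()
guard.name = 'detect_pii'
guard.use(DetectPII(pii_entities=["EMAIL_ADDRESS", "PHONE_NUMBER"], on_fail="fix"))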

Additionally, the server-side error log reports:

ERROR:guardrails-api:The callable `fn` passed to `Guard(fn, ...)` failed with the following error: `litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=llama-3.1-8b-instruct
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers`. Make sure that `fn` can be called as a function
that accepts a prompt string, **kwargs, and returns a string.
If you're using a custom LLM callable, please see docs
here: https://go.guardrailsai.com/B1igEy3
...
litellm.exceptions.BadRequestError: litellm.BadRequestError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. You passed model=llama-3.1-8b-instruct
Pass model as E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/starcoder',..)` Learn more: https://docs.litellm.ai/docs/providers
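
For context, LiteLLM infers the provider from a prefix on the model name (as the error's huggingface/starcoder example shows). A minimal sketch of calling an OpenAI-compatible endpoint through LiteLLM directly might look like this; the openai/ prefix and the api_base/api_key values are assumptions for illustration, not a confirmed fix:

from litellm import completion

# Illustrative direct LiteLLM call against an OpenAI-compatible endpoint;
# the provider prefix and endpoint values are assumptions.
response = completion(
    model="openai/llama-3.1-8b-instruct",
    api_base="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(response.choices[0].message.content)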

We would appreciate guidance on how to resolve this issue. Specifically:

  • How can we configure Guardrails Server to recognize and correctly interact with our custom LLaMA model endpoint?
  • What changes are necessary to properly pass the model and provider information?
  • Is there an example for setting up a custom LLM callable in this context?
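
Going by the requirement quoted in the server log (a callable that accepts a prompt string plus **kwargs and returns a string), a rough sketch of such a callable is below; how it gets registered with the server-side guard is not shown, and the endpoint values are just the anonymized ones from above:

from openai import OpenAI
import httpx

vllm_client = OpenAI(
    base_url="http://my-openai.endpoint.com/v1",
    api_key="my-api-key",
    http_client=httpx.Client(verify=False),
)

# Hypothetical custom LLM callable: takes a prompt string and **kwargs,
# forwards the prompt to the vLLM endpoint, and returns the reply as a string.
def my_llm(prompt: str, **kwargs) -> str:
    result = vllm_client.chat.completions.create(
        model="llama-3.1-8b-instruct",
        messages=[{"role": "user", "content": prompt}],
    )
    return result.choices[0].message.content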

Thank you!
