Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenAI output null values for keys are converted to True in validated response #479

Closed
LakshmiPanguluri opened this issue Dec 1, 2023 · 9 comments
Labels
bug Something isn't working urgent

Comments

@LakshmiPanguluri
Copy link

LakshmiPanguluri commented Dec 1, 2023

Describe the bug
OpenAI output null values for keys are converted to True in the validated response

To Reproduce
Steps to reproduce the behavior:

from typing import List, Dict, Optional, Any, Union

import openai
from pydantic import BaseModel as BaseModel, ConfigDict, Field
import sys

from guardrails.validators import ValidChoices

if sys.version_info >= (3, 8):
    from typing import Literal
else:
    from typing_extensions import Literal

metamodel_version = "None"
version = "1.0"

class ConfiguredBaseModel(BaseModel):
    model_config = ConfigDict(
    validate_assignment=True,
    validate_default=True,
    extra='forbid',
    arbitrary_types_allowed=True,
    use_enum_values = True)

class Person(ConfiguredBaseModel):
    """
    A human being regarded as an individual.
    """
    name: str = Field(..., description="""Name of the person""")
    birth_date: Optional[str] = Field(None, description="""date of birth of person""")
    death_date: Optional[str] = Field(None, description="""date of death of person""")
    salary: Optional[str] = Field(None, description="""salary of the given person""")
    occupation: Optional[str] = Field(None, description="""salary of the given person""", validators=[ValidChoices(choices=['Politician'], on_fail='noop')])

prompt = """Extract person information from the give text. Each extracted value should be valid.

text= "Elon Reeve Musk (born June 28, 1971) is a businessman and investor. Musk is the founder, chairman, CEO and chief technology officer of SpaceX; angel investor, CEO, product architect and former chairman of Tesla, Inc.; owner, chairman and CTO of X Corp.; founder of the Boring Company and xAI; co-founder of Neuralink and OpenAI; and president of the Musk Foundation. He is the wealthiest person in the world, with an estimated net worth of US$219 billion as of November 2023, according to the Bloomberg Billionaires Index, and $241 billion according to Forbes, primarily from his ownership stakes in Tesla and SpaceX."
${gr.complete_json_suffix}"""

import guardrails as gd

from rich import print

guard = gd.Guard.from_pydantic(output_class=Person, prompt=prompt)

response = guard(
    openai.chat.completions.create, model="gpt-3.5-turbo", #"gpt-4-1106-preview",
    prompt= guard.base_prompt
)

print(response)

Expected behavior
A clear and concise description of what you expected to happen.

Library version:
Version (e.g. 0.1.5)

Additional context
Add any other context about the problem here.

image

@LakshmiPanguluri LakshmiPanguluri added the bug Something isn't working label Dec 1, 2023
@LakshmiPanguluri LakshmiPanguluri changed the title OpenAI output null values for keys are OpenAI output null values for keys are converted to True in validated response Dec 1, 2023
@irgolic
Copy link
Contributor

irgolic commented Dec 4, 2023

Hi, thanks for reporting this issue, we'll look into it.

@zsimjee
Copy link
Collaborator

zsimjee commented Dec 10, 2023

My hunch here is that null is being read in as a Truthy string, and then getting translated to True. They should instead be dropped if the field is optional. if the field is mandatory, we should treat it as an error in skeletal validation.

@sudhanshu746
Copy link

@irgolic , I'm following the progress.

@tbrownio
Copy link

The offender appears to be here:

https://github.com/guardrails-ai/guardrails/blob/02ba65cd9a49e7bdbd5ecc700b0ea97d5cc551be/guardrails/utils/json_utils.py#L40C9-L41C24

By commenting out these lines, I am able to avoid this issue. That being said, I'm not totally sure what's going on here. @irgolic any ideas what this is doing?

@tbrownio
Copy link

tbrownio commented Jan 5, 2024

Hey all, any updates here?

@thekaranacharya
Copy link
Contributor

thekaranacharya commented Jan 8, 2024

Hello @LakshmiPanguluri, we support multiple json suffixes that can be added to the prompt; those can be found here.

Replacing the current ${gr.complete_json_suffix} with ${gr.complete_json_suffix_v2} works better in this case, and does not output null values in the first place. If the information is not given in the prompt itself and if the field is optional, the LLM won't output the field and it's value - that's kinda the point of optional fields right?

Also, I noticed a few typos in the prompt. Here's a corrected version of the prompt (which also includes a recommended design according to multiple prompt designing guides available online):

prompt = """Extract person information from the given text. Each extracted value should be valid.

text:
Elon Reeve Musk (born June 28, 1971) is a businessman and investor. Musk is the founder, chairman, CEO and chief technology officer of SpaceX; angel investor, CEO, product architect and former chairman of Tesla, Inc.; owner, chairman and CTO of X Corp.; founder of the Boring Company and xAI; co-founder of Neuralink and OpenAI; and president of the Musk Foundation. He is the wealthiest person in the world, with an estimated net worth of US$219 billion as of November 2023, according to the Bloomberg Billionaires Index, and $241 billion according to Forbes, primarily from his ownership stakes in Tesla and SpaceX.

${gr.complete_json_suffix_v2}
"""

(Basically, we replace the == with : and get the text on a new line. That seems to work best, as it's natural language, not code.)

This is the raw and validated outputs I received after updating the prompt:
Screenshot 2024-01-08 at 5 18 03 PM

@tbrownio Great catch!
Now as far as why we convert the optional fields' values to True if null, needs some more research. Let us get back to you on this as soon as we can.

@thekaranacharya
Copy link
Contributor

thekaranacharya commented Jan 12, 2024

@tbrownio We looked into this in a bit more detail. When the field is optional and the value is null, we want to exit earlier and hence we return True from the base class - Placeholder's verify method. The boolean True stands for returning it early.

Once we get that in e.g here and multiple other places, instead of returning the super_result directly we should return None instead so as to avoid the main issue here. We're planning to add this fix soon.

@tbrownio
Copy link

tbrownio commented Jan 16, 2024 via email

@thekaranacharya
Copy link
Contributor

A fix has been added for this issue! Merged in main, for now try using the latest code from main, until it makes it way into the package PyPI.

Closing this issue. @LakshmiPanguluri @tbrownio

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working urgent
Projects
None yet
Development

No branches or pull requests

7 participants