OpenAI output null values for keys are converted to True in validated response #479

LakshmiPanguluri · 2023-12-01T15:12:10Z

Describe the bug
OpenAI output null values for keys are converted to True in the validated response

To Reproduce
Steps to reproduce the behavior:

from typing import List, Dict, Optional, Any, Union

import openai
from pydantic import BaseModel, ConfigDict, Field
import sys

from guardrails.validators import ValidChoices

if sys.version_info >= (3, 8):
    from typing import Literal
else:
    from typing_extensions import Literal

metamodel_version = "None"
version = "1.0"

class ConfiguredBaseModel(BaseModel):
    model_config = ConfigDict(
        validate_assignment=True,
        validate_default=True,
        extra='forbid',
        arbitrary_types_allowed=True,
        use_enum_values=True,
    )

class Person(ConfiguredBaseModel):
    """
    A human being regarded as an individual.
    """
    name: str = Field(..., description="""Name of the person""")
    birth_date: Optional[str] = Field(None, description="""date of birth of person""")
    death_date: Optional[str] = Field(None, description="""date of death of person""")
    salary: Optional[str] = Field(None, description="""salary of the given person""")
    occupation: Optional[str] = Field(None, description="""occupation of the given person""", validators=[ValidChoices(choices=['Politician'], on_fail='noop')])

prompt = """Extract person information from the give text. Each extracted value should be valid.

text= "Elon Reeve Musk (born June 28, 1971) is a businessman and investor. Musk is the founder, chairman, CEO and chief technology officer of SpaceX; angel investor, CEO, product architect and former chairman of Tesla, Inc.; owner, chairman and CTO of X Corp.; founder of the Boring Company and xAI; co-founder of Neuralink and OpenAI; and president of the Musk Foundation. He is the wealthiest person in the world, with an estimated net worth of US$219 billion as of November 2023, according to the Bloomberg Billionaires Index, and $241 billion according to Forbes, primarily from his ownership stakes in Tesla and SpaceX."
${gr.complete_json_suffix}"""

import guardrails as gd

from rich import print

guard = gd.Guard.from_pydantic(output_class=Person, prompt=prompt)

response = guard(
    openai.chat.completions.create, model="gpt-3.5-turbo", #"gpt-4-1106-preview",
    prompt= guard.base_prompt
)

print(response)

irgolic · 2023-12-04T14:00:13Z

Hi, thanks for reporting this issue, we'll look into it.

zsimjee · 2023-12-10T21:42:03Z

My hunch here is that null is being read in as a truthy string and then translated to True. Null values should instead be dropped if the field is optional. If the field is mandatory, we should treat it as an error in skeletal validation.
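
A minimal sketch of the behaviour proposed above, assuming a flat JSON object and a caller-supplied set of mandatory keys. This is not Guardrails code: `drop_or_reject_nulls` and its `required` parameter are hypothetical names used only for illustration.

```python
import json
from typing import Any, Dict, Set


def drop_or_reject_nulls(raw_json: str, required: Set[str]) -> Dict[str, Any]:
    """Drop nulls for optional keys; raise if a required key is null or missing.

    Hypothetical helper, not part of Guardrails.
    """
    data = json.loads(raw_json)  # JSON null is parsed as Python None
    cleaned: Dict[str, Any] = {}
    for key, value in data.items():
        if value is None:
            if key in required:
                raise ValueError(f"required field {key!r} is null")
            continue  # optional field: drop it instead of coercing it
        cleaned[key] = value
    missing = required - set(cleaned)
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    return cleaned
```

With the `Person` model from the report, `name` would be the only required key, so a null `salary` is dropped while a null `name` is rejected.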

sudhanshu746 · 2023-12-11T10:07:29Z

@irgolic , I'm following the progress.

tbrownio · 2023-12-27T05:46:55Z

The offender appears to be here:

https://github.com/guardrails-ai/guardrails/blob/02ba65cd9a49e7bdbd5ecc700b0ea97d5cc551be/guardrails/utils/json_utils.py#L40C9-L41C24

By commenting out these lines, I am able to avoid this issue. That being said, I'm not totally sure what's going on here. @irgolic any ideas what this is doing?

tbrownio · 2024-01-05T21:06:08Z

Hey all, any updates here?

thekaranacharya · 2024-01-08T22:17:00Z

Hello @LakshmiPanguluri, we support multiple json suffixes that can be added to the prompt; those can be found here.

Replacing the current ${gr.complete_json_suffix} with ${gr.complete_json_suffix_v2} works better in this case and does not output null values in the first place. If the information is not given in the prompt and the field is optional, the LLM won't output the field or its value. That's kinda the point of optional fields, right?

Also, I noticed a few typos in the prompt. Here's a corrected version (which also follows a layout recommended by several prompt design guides available online):

prompt = """Extract person information from the given text. Each extracted value should be valid.

text:
Elon Reeve Musk (born June 28, 1971) is a businessman and investor. Musk is the founder, chairman, CEO and chief technology officer of SpaceX; angel investor, CEO, product architect and former chairman of Tesla, Inc.; owner, chairman and CTO of X Corp.; founder of the Boring Company and xAI; co-founder of Neuralink and OpenAI; and president of the Musk Foundation. He is the wealthiest person in the world, with an estimated net worth of US$219 billion as of November 2023, according to the Bloomberg Billionaires Index, and $241 billion according to Forbes, primarily from his ownership stakes in Tesla and SpaceX.

${gr.complete_json_suffix_v2}
"""

(Basically, we replace the = with : and put the text on a new line. That seems to work best, as it's natural language, not code.)

These are the raw and validated outputs I received after updating the prompt:

@tbrownio Great catch!
As for why we convert optional fields' values to True when they are null, that needs some more research. We'll get back to you on this as soon as we can.

thekaranacharya · 2024-01-12T17:45:10Z

@tbrownio We looked into this in a bit more detail. When the field is optional and the value is null, we want to exit early, so the verify method of the base class, Placeholder, returns True. The boolean True only signals the early return; it is not meant to be the field's value.

Where that result is received, e.g. here (https://github.com/guardrails-ai/guardrails/blob/8833c0507d58825b2712d7660b14efff7dc69855/guardrails/utils/json_utils.py#L129-L130) and in multiple other places, we currently return super_result directly. We should return None instead, which avoids the main issue here. We're planning to add this fix soon.
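
The mechanism described above can be illustrated with a simplified sketch. It only assumes what the comment states (a base `Placeholder` whose `verify` returns True for an optional null, and callers that pass `super_result` through). The subclass names `BuggyValuePlaceholder` and `FixedValuePlaceholder`, the constructor, and the `verify` signature are hypothetical, and the real Guardrails code may be structured differently.

```python
from typing import Any, Optional


class Placeholder:
    """Simplified stand-in for the base class mentioned above (not the real one)."""

    def __init__(self, optional: bool) -> None:
        self.optional = optional

    def verify(self, value: Any) -> Optional[bool]:
        # True is a control signal: "optional and null, stop checking early".
        if self.optional and value is None:
            return True
        return None  # no early exit; the subclass continues


class BuggyValuePlaceholder(Placeholder):
    def verify(self, value: Any) -> Any:
        super_result = super().verify(value)
        if super_result is not None:
            return super_result  # bug: the True signal leaks out as the value
        return value


class FixedValuePlaceholder(Placeholder):
    def verify(self, value: Any) -> Any:
        super_result = super().verify(value)
        if super_result is True:
            return None  # still exits early, but the null stays None
        return value


print(BuggyValuePlaceholder(optional=True).verify(None))  # True
print(FixedValuePlaceholder(optional=True).verify(None))  # None
```

The design issue is that one return value carries two meanings: a control signal and a field value. Returning None at the call site keeps the two apart.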

tbrownio · 2024-01-16T17:53:20Z

Thanks for following up here, and I'm looking forward to a fix.

thekaranacharya · 2024-02-28T16:47:27Z

A fix for this issue has been merged into main! For now, try using the latest code from main until it makes its way into the package on PyPI.
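
For anyone pinned to a release without the fix, one possible stopgap is to compare the raw LLM output with the validated output and restore the nulls. The sketch below assumes both are available as a JSON string and a flat dict respectively; how to obtain them from the guard's response is not shown here. `restore_nulls` is a hypothetical helper, not a Guardrails API.

```python
import json
from typing import Any, Dict


def restore_nulls(raw_json: str, validated: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `validated` where keys that were null in the raw
    output but True after validation are set back to None.

    Hypothetical workaround; handles flat objects only.
    """
    raw = json.loads(raw_json)
    fixed = dict(validated)
    for key, raw_value in raw.items():
        if raw_value is None and fixed.get(key) is True:
            fixed[key] = None
    return fixed
```

Only keys that were null in the raw output are touched, so a field that is legitimately True after validation is left alone.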

Closing this issue. @LakshmiPanguluri @tbrownio

LakshmiPanguluri added the bug Something isn't working label Dec 1, 2023

LakshmiPanguluri changed the title ~~OpenAI output null values for keys are~~ OpenAI output null values for keys are converted to True in validated response Dec 1, 2023

smohiuddin added the urgent label Jan 12, 2024

thekaranacharya mentioned this issue Feb 27, 2024

Bugfix: Raw LLM response: null -> True #604

Merged

thekaranacharya closed this as completed Feb 28, 2024

CalebCourier mentioned this issue Mar 14, 2024

[bug] JSON validation converts null values to True:bool #641

Closed
