
[bug fix][AIC-py] add missing global model settings on init #765

Merged
merged 1 commit into from
Jan 5, 2024

Conversation

jonathanlastmileai
Contributor

@jonathanlastmileai jonathanlastmileai commented Jan 5, 2024

[bug fix][AIC-py] add missing global model settings on init

Adding a prompt with a model name to a new AIConfig makes the config invalid,
because the name is effectively a dangling foreign key into the AIConfig's
global model settings mapping.

Immediate fix:

When we add prompt_1, also add a default global mapping entry for each
registered model.

Better fix: Reject updates to AIConfig that would make it invalid
(inside add_prompt). However, this could break existing code
that adds a prompt and then immediately adds the missing model, as we're doing
in this PR.

In the future, we should make AIConfig more constrained by definition
so that this kind of invalid object can't exist to begin with.

Test: run the server, type into prompt_1, and it runs.
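The dangling-reference invariant described above can be sketched with plain dicts (an illustrative model of the serialized config shape, not the actual AIConfig classes):

```python
# Plain-dict sketch of the invariant: every prompt.metadata.model string
# must resolve to a key in the aiconfig-level metadata.models mapping.

def dangling_model_refs(aiconfig: dict) -> list:
    """Return prompt model names that have no global model settings entry."""
    models = (aiconfig.get("metadata") or {}).get("models") or {}
    dangling = []
    for prompt in aiconfig.get("prompts", []):
        model = (prompt.get("metadata") or {}).get("model")
        if isinstance(model, str) and model not in models:
            dangling.append(model)
    return dangling

config = {
    "metadata": {"models": {}},
    "prompts": [{"name": "prompt_1", "metadata": {"model": "gpt-4"}}],
}
print(dangling_model_refs(config))  # the "gpt-4" reference is dangling

# The immediate fix in this PR: register a default global entry for the model.
config["metadata"]["models"]["gpt-4"] = {"model": "gpt-4"}
print(dangling_model_refs(config))  # no dangling references remain
```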

@rossdanlm
Contributor

> Better fix: Reject updates to AIConfig that would make it invalid (inside add_prompt). However, this could break existing code that adds a prompt and then immediately adds the missing model, like we're doing in this PR.

Forcing the user to explicitly define the model at the aiconfig level is kind of unintuitive and not easy to do if you aren't familiar with the AIConfig schema.

> In the future, we should make AIConfig more constrained by definition so that this kind of invalid object can't exist to begin with.

This isn't strictly possible. Ex: if in the editor we want to choose a new model for a prompt that isn't defined in the aiconfig-level metadata.models field (which this diff fixed), there's no way of having it pre-defined, other than forcing the user to enter it manually, which as I mentioned earlier isn't a good UI experience.

For me the best solution would be to modify this in update_model: when the model name changes, we should:

  1. Add the new model info to the aiconfig-level settings IF it's not already defined
  2. Remove the old model from the aiconfig-level settings IF no other prompts use it

1 is necessary; 2 is nice to have. Either way, this diff unblocks us for the editor (except for custom model parsers), but I will mark this as P1. Do you want me to own this followup (post-MVP) or do you want to do it?

Extra, beyond scope of this diff:

There are also a lot of "initial state" considerations, but for now I think this diff is good to get us unblocked. Ex: instead of initializing every model, I think we should go through each initial prompt, make a list of all used models, and add those if they're not defined (and if they're among the valid core models we support). We could also handle the case where no prompt has been created yet by defaulting the first prompt to gpt-3 and adding that model as well (so yes, you're right that we'd probably need some custom functionality in add_prompt to check for this edge case).
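The two update_model steps proposed above could be sketched with plain dicts (a hypothetical helper, not the SDK's actual update_model implementation):

```python
def update_model(aiconfig: dict, prompt_name: str, new_model: str) -> None:
    """Sketch of the proposed behavior: repoint a prompt at new_model,
    adding/removing aiconfig-level model entries as needed."""
    models = aiconfig["metadata"].setdefault("models", {})
    prompts = aiconfig["prompts"]
    target = next(p for p in prompts if p["name"] == prompt_name)
    old_model = target["metadata"].get("model")

    # Step 1 (necessary): add the new model's global settings if missing.
    if new_model not in models:
        models[new_model] = {"model": new_model}
    target["metadata"]["model"] = new_model

    # Step 2 (nice to have): drop the old entry if no other prompt uses it.
    still_used = any(p["metadata"].get("model") == old_model for p in prompts)
    if old_model is not None and not still_used:
        models.pop(old_model, None)

config = {
    "metadata": {"models": {"gpt-4": {"model": "gpt-4"}}},
    "prompts": [{"name": "prompt_1", "metadata": {"model": "gpt-4"}}],
}
update_model(config, "prompt_1", "gpt-3.5-turbo")
print(config["metadata"]["models"])  # gpt-4 removed, gpt-3.5-turbo added
```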

Contributor

@rossdanlm rossdanlm left a comment
Nice, big unblocker!

@jonathanlastmileai jonathanlastmileai merged commit 1625d10 into main Jan 5, 2024
2 checks passed
if aiconfig_runtime.metadata.models is None:
    aiconfig_runtime.metadata.models = {}
if model_id not in aiconfig_runtime.metadata.models:
    aiconfig_runtime.add_model(model_id, {"model": model_id})
Contributor

Isn't this going to put a "model" key into the completion params for all models by default now? For example, for HuggingFaceTextGenerationParser the settings would contain {"model": "HuggingFaceTextGenerationParser"} in this case. I don't think that's what we want, right?

Isn't this issue only a problem for those model parsers that actually support / need the 'model' in their completion params? If so, I think we should resolve this at the model parser level.

e.g. for openai deserialize:

model_settings = self.get_model_settings(prompt, aiconfig) or {"model": aiconfig.get_model_name(prompt)}

cc @jonathanlastmileai , @rossdanlm
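The parser-level idea can be sketched generically (hypothetical helper and flag, not the actual model parser API): only parsers whose completion params actually need a "model" key would inject it.

```python
# Sketch of resolving completion params at the model parser level:
# the "model" key is injected only for parsers that need it.

def get_completion_params(prompt_settings, model_name, needs_model_key):
    """Merge per-prompt settings, falling back to the model name only
    when this parser's underlying API expects a "model" key."""
    settings = dict(prompt_settings or {})
    if needs_model_key and "model" not in settings:
        settings["model"] = model_name  # e.g. OpenAI-style parsers
    return settings

print(get_completion_params(None, "gpt-4", needs_model_key=True))
# A HuggingFaceTextGenerationParser-style parser would pass
# needs_model_key=False and never get a {"model": ...} entry injected.
```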

Contributor Author

Right, good point. I'm a bit confused again. Let's start from here: let's write a minimal example of a valid serialized AIConfig for GPT-4, and another valid serialized AIConfig for a model that doesn't put a "model" key into the underlying API call - maybe HF text generation. Let's get crystal clear on why each one is valid and why the alternatives would be invalid.

GPT4

{
  "metadata": {
    "models": {
      "gpt-4": {
        "model": "gpt-4"
      }
    }
  },
  "prompts": [
    {
      "name": "get_activities",
      "input": "Tell me 10 fun attractions to do in NYC.",
      "metadata": {
        "model": "gpt-4"
      }
    }
  ]
}

I think this is valid because the string-valued prompt.metadata.model points to an existing metadata.models key whose value is a dict.

This is an invalid variant of the config above, and is what we were seeing before:

{
  "metadata": {
    "models": {
    }
  },
  "prompts": [
    {
      "name": "get_activities",
      "input": "Tell me 10 fun attractions to do in NYC.",
      "metadata": {
        "model": "gpt-4"
      }
    }
  ]
}

Right so far?

Contributor Author

Assuming that's correct, here's something I'm not clear on: what is a valid and invalid equivalent for another model e.g. HuggingFaceTextGenerationParser ?

By the design of AIConfig, shouldn't it just be the same as for GPT4 but with the name changed?
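Presumably yes. As a hedged illustration (the entry contents here are assumptions, not this parser's actual defaults), the equivalent config could keep the global settings entry empty, since this parser doesn't pass a "model" key to the underlying API; validity only requires that the referenced key exist:

```json
{
  "metadata": {
    "models": {
      "HuggingFaceTextGenerationParser": {}
    }
  },
  "prompts": [
    {
      "name": "get_activities",
      "input": "Tell me 10 fun attractions to do in NYC.",
      "metadata": {
        "model": "HuggingFaceTextGenerationParser"
      }
    }
  ]
}
```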

@jonathanlastmileai jonathanlastmileai deleted the pr765 branch January 5, 2024 15:30
@jonathanlastmileai
Contributor Author

> > Better fix: Reject updates to AIConfig that would make it invalid (inside add_prompt). However, this could break existing code that adds a prompt and then immediately adds the missing model, like we're doing in this PR.
>
> Forcing the user to explicitly define the model at the aiconfig level is kind of unintuitive and not easy to do if you aren't familiar with the AIConfig schema.
>
> > In the future, we should make AIConfig more constrained by definition so that this kind of invalid object can't exist to begin with.
>
> This isn't strictly possible. Ex: if in the editor we want to choose a new model for a prompt that isn't defined in the aiconfig-level metadata.models field (which this diff fixed), there's no way of having it pre-defined, other than forcing the user to enter it manually, which as I mentioned earlier isn't a good UI experience.
>
> For me the best solution would be to modify this in update_model: when the model name changes, we should:
>
>   1. Add the new model info to the aiconfig-level settings IF it's not already defined
>   2. Remove the old model from the aiconfig-level settings IF no other prompts use it
>
> 1 is necessary; 2 is nice to have. Either way, this diff unblocks us for the editor (except for custom model parsers), but I will mark this as P1. Do you want me to own this followup (post-MVP) or do you want to do it?
>
> Extra, beyond scope of this diff:
>
> There are also a lot of "initial state" considerations, but for now I think this diff is good to get us unblocked. Ex: instead of initializing every model, I think we should go through each initial prompt, make a list of all used models, and add those if they're not defined (and if they're among the valid core models we support). We could also handle the case where no prompt has been created yet by defaulting the first prompt to gpt-3 and adding that model as well (so yes, you're right that we'd probably need some custom functionality in add_prompt to check for this edge case).

Let's P1 this discussion here https://docs.google.com/document/d/1YwUlIQZX8CiuvtbVR4OvpFQ2nphFITRD_XZR3CDNtoE/edit#bookmark=id.ipu45nan09yk

saqadri added a commit that referenced this pull request Jan 5, 2024
Revert #765 to fix properly in the AIConfig SDK
@saqadri saqadri mentioned this pull request Jan 5, 2024
saqadri added a commit that referenced this pull request Jan 5, 2024
Revert #765

Revert #765 to fix properly in the AIConfig SDK