Currently, if you converse with GPT Vision, the bot always determines whether you want to draw something by making a separate API call. This second API call is costly, so we want to be able to turn it off; better yet, we want to remove the second API call entirely.
I was a bit (a lot) stupid when designing this initially. We can avoid the second API call by putting instructions in the system pretext that ask the model to use a special syntax to denote a drawing prompt in its response whenever it picks up on intent to draw; we would then kick off the drawing from there, removing the extra API call entirely. This also speeds up the conversation.
All the scaffolding for drawing is already in the code; what needs to be done is:

- Remove the LLM API call that evaluates the last few messages of history.
- Create some pretext that gets GPT to respond with a special syntax when it picks up on a user's intent to draw (see the parsing sketch after this list). For example, you could try something like:

  > If the user wants to draw something, retain a prompt for what the user wants to draw such that it can go into a generator such as DALL-E, and after responding to the user's message, type the prompt for what to draw within a special sequence of characters, `#%^`, for example: `#%^a dog#%^`
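As a rough illustration of the parsing side, here is a minimal sketch of extracting a marked drawing prompt from the model's reply using the `#%^` syntax above. The function name and signature are hypothetical, not from the existing codebase:

```python
import re

# Matches the proposed #%^...#%^ marker syntax, e.g. "#%^a dog#%^".
DRAW_MARKER = re.compile(r"#%\^(.+?)#%\^", re.DOTALL)

def extract_drawing_prompt(reply: str) -> tuple[str, str | None]:
    """Split a model reply into user-facing text and an optional drawing prompt."""
    match = DRAW_MARKER.search(reply)
    if not match:
        return reply, None
    # Strip the marker block so the user never sees the raw syntax.
    cleaned = DRAW_MARKER.sub("", reply).strip()
    return cleaned, match.group(1).strip()
```

For example, `extract_drawing_prompt("Sure, here you go! #%^a dog#%^")` would return `("Sure, here you go!", "a dog")`, and the second element can be fed straight into the existing drawing scaffolding.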
This drawing-syntax pretext would have to be appended programmatically, not just written into the main text of the pretext, because it should work for other openers too. There should also be some sort of recurring system message every X messages that acts as a reminder to keep identifying drawing intent, and in the future we can expand our agent's capabilities further on this basis.
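Something along these lines could handle both the programmatic append and the recurring reminder. This is a sketch under assumptions: `build_messages`, `DRAW_SYNTAX_PRETEXT`, and `REMINDER_INTERVAL` are all hypothetical names, and the interval value is a placeholder for "every X messages":

```python
# Hypothetical instruction text, appended to any opener rather than baked into one.
DRAW_SYNTAX_PRETEXT = (
    "If the user wants to draw something, retain a prompt for what they want "
    "to draw such that it can go into a generator such as DALL-E. After "
    "responding to the user, type that prompt within the special sequence "
    "#%^, for example: #%^a dog#%^"
)

REMINDER_INTERVAL = 10  # "every X messages"; placeholder value

def build_messages(opener: str, history: list[dict]) -> list[dict]:
    """Append the drawing-syntax instructions to any opener, then re-inject
    them as a recurring system reminder every REMINDER_INTERVAL messages."""
    messages = [{"role": "system", "content": opener + "\n\n" + DRAW_SYNTAX_PRETEXT}]
    for i, message in enumerate(history, start=1):
        messages.append(message)
        if i % REMINDER_INTERVAL == 0:
            messages.append({"role": "system", "content": DRAW_SYNTAX_PRETEXT})
    return messages
```

Keeping the instructions as a separate constant means any opener gets the drawing behavior for free, and the reminder keeps the model from drifting away from the syntax in long conversations.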
There are some drawbacks to this: the independent evaluator should theoretically be more accurate, since it is more focused and isn't weighed down by the entire conversation context.