Implementing personalities which persist between conversations, and priority context which persists within one conversation #3402
UncleSporky started this conversation in Ideas
Replies: 1 comment
-
I must begin by apologizing if this request is inappropriate, premature, a duplicate, or rooted in misunderstandings of the project. I am inexperienced with GitHub and the type of coding necessary for this project, but I wanted to offer this thought anyway.
Initial experiments with generative chatbots online have shown that users have gotten some utility out of asking the bot to assume various identities, thus far mostly to "jailbreak" it and force it to break its own rules. See: ChatGPT's DAN (CNBC), Bing prompt injection (Ars Technica).
Open Assistant should generally not need to be jailbroken, but the underlying concept of assigning the model an identity seems useful and interesting in its own right. Therefore I propose two features: the ability for users to assign "personalities" to Open Assistant which persist through all new conversations until halted or changed, and the ability for users to assign "priority" information to Open Assistant which persists throughout only the current conversation until halted or changed.
I feel strongly that such additions early in the project would go far in promoting adoption and experimentation with Open Assistant. Current widespread models like ChatGPT and Bing do not offer these features, and having a minor leg up on them in this respect would draw interest. As we've seen with DAN, creating a useful or fun personality for the bot to adopt would become a type of prompt crafting all its own; users would collaborate with each other to tweak them until they perform well, and anyone could copy these personalities to give a bit of continuity of experience across communities as users compare and contrast their interactions. In a way it's akin to choosing a voice or personality for your virtual assistant or GPS -- people love this kind of customization, and these models make it very easy to do. It would be beneficial to take it a step further with Open Assistant and make it so people don't have to copy/paste a chunk of text into all their interactions.
Implementation:
As a completely uninformed layman, I can only regurgitate my best understanding of how the model works so far, so let me know if this is way off-base or prohibitive.
From what I understand, the model is limited to a certain number of tokens of information that it keeps in active memory and "thinks about" as it responds to the user. A "personality" would be a certain number of tokens set aside in the background for the model to always keep in mind, so it would not forget and break character. You would have to inform the user that the longer their personality prompt, the fewer tokens are available for ongoing memory, and encourage them to keep things brief ("Adopt the phrasing and mannerisms of a wise old man who has seen much and is eager to teach younger generations."). A GUI for Open Assistant might offer a drop-down somewhere listing all the personalities a user has created, along with buttons to create new ones or delete old ones, but initially it might be easiest to use some sort of escape character or phrase (/personality set "Answer all questions as a Tyrannosaurus rex and end sentences with a roar.").
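For concreteness, here is a minimal Python sketch of how such a command and a per-user personality store might fit together. Everything here is hypothetical and not part of the Open Assistant codebase: the names (PersonalityStore, MAX_PERSONALITY_TOKENS, count_tokens) are invented, and the token count is a rough whitespace proxy standing in for the model's real tokenizer.

```python
import shlex

MAX_PERSONALITY_TOKENS = 64  # hypothetical budget carved out of the context window


def count_tokens(text: str) -> int:
    # Stand-in for the model's real tokenizer; a whitespace split is only a rough proxy.
    return len(text.split())


class PersonalityStore:
    """Per-user personalities that persist across conversations (illustrative only)."""

    def __init__(self) -> None:
        self._by_user: dict[str, str] = {}

    def handle_command(self, user_id: str, message: str) -> str | None:
        """Intercept '/personality ...' commands; return None for ordinary messages."""
        if not message.startswith("/personality"):
            return None
        parts = shlex.split(message)  # honors the quoted prompt text
        if len(parts) >= 3 and parts[1] == "set":
            prompt = parts[2]
            used = count_tokens(prompt)
            if used > MAX_PERSONALITY_TOKENS:
                return (f"Personality too long ({used} tokens, "
                        f"limit {MAX_PERSONALITY_TOKENS}); please shorten it.")
            self._by_user[user_id] = prompt
            return "Personality set. It applies to all new conversations until cleared."
        if len(parts) >= 2 and parts[1] == "clear":
            self._by_user.pop(user_id, None)
            return "Personality cleared."
        return 'Usage: /personality set "<prompt>" | /personality clear'

    def get(self, user_id: str) -> str | None:
        return self._by_user.get(user_id)
```

With something like this, the chat frontend would call handle_command first and only forward the message to the model when it returns None.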
The simplest and most wasteful implementation would be to invisibly prepend the personality to every user prompt (as apparently Bing's chat does with its secret rules), though I would worry that this might interfere with communication if the model thinks the user is actually typing those words every time. It would take testing, and it would multiply the number of tokens consumed by every interaction.
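To illustrate the difference, here is a hedged sketch of that naive prepending next to a variant that keeps the personality in a dedicated system-style slot. The special tokens shown are placeholders, not confirmed Open Assistant markers; whatever tokens the deployed model was actually trained with would have to be used instead.

```python
def build_prompt_naive(personality: str | None,
                       history: list[tuple[str, str]],
                       user_message: str) -> str:
    # Naive variant from the paragraph above: splice the personality into every
    # user turn. The model may mistake it for the user's own words, and its
    # token cost is paid again for each past turn kept in the context window.
    def user_turn(text: str) -> str:
        return f"{personality}\n{text}" if personality else text

    turns = [user_turn(t) if role == "prompter" else t for role, t in history]
    turns.append(user_turn(user_message))
    return "\n".join(turns)


def build_prompt_system_slot(personality: str | None,
                             history: list[tuple[str, str]],
                             user_message: str) -> str:
    # Alternative: keep the personality in a dedicated system-style slot,
    # clearly separated from user turns, so it is stated once and never
    # attributed to the user. The special tokens below are placeholders.
    parts = []
    if personality:
        parts.append(f"<|system|>{personality}<|endoftext|>")
    for role, text in history:  # role is "prompter" or "assistant"
        parts.append(f"<|{role}|>{text}<|endoftext|>")
    parts.append(f"<|prompter|>{user_message}<|endoftext|><|assistant|>")
    return "".join(parts)
```

The second variant also pays the personality's token cost once per assembled prompt rather than once per remembered user turn.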
"Priority" information would be handled much the same way, but just be temporary for the duration of one conversation. This would be for setting the tone and operating under a specific context, like "answer all my questions in a way that an elementary school student could understand," or "print every sentence on a separate line with a hashtag symbol at the start."
Let me know what you think and whether something like this would be feasible.
-
Yep, this sounds totally reasonable; it will give users fun things to play with and a way to make their assistant their own. Anyone interested in this sort of thing should probably read the LLM bits on https://learnprompting.org/ - they're well cited and have good general info.