Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ChatUser does not comply with the jupyter identity model #1025

Open
krassowski opened this issue Oct 11, 2024 · 9 comments
Open

ChatUser does not comply with the jupyter identity model #1025

krassowski opened this issue Oct 11, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@krassowski
Copy link
Member

Description

ChatUser is defined here:

class ChatUser(BaseModel):
# User ID assigned by IdentityProvider.
username: str
initials: str
name: str
display_name: str
color: Optional[str]
avatar_url: Optional[str]

The comment says that the username should come from the identity provider. But it does not, it is instead taken from getpass.getuser():

login = getpass.getuser()
initials = login[0].capitalize()
return ChatUser(
username=login,
initials=initials,

This means that the username accessible via app.services.user.identity.username does not match the username stored on the messages. This means that request from @dlqqq in #1022 (review) cannot be implemented cleanly.

See https://jupyterlab.readthedocs.io/en/stable/extension/identity.html#identity

Reproduce

image

Expected behavior

The identity from the IdentityProvider is used.

@dlqqq
Copy link
Member

dlqqq commented Oct 14, 2024

@krassowski This was a deliberate design decision that was made to avoid showing an "Anonymous XYZ" username by default, even when jupyter_collaboration is not installed. The rationale is that when jupyter_collaboration is not installed, identity usernames from the default IdentityProvider are not shown anywhere else in the UI, which could greatly confuse users. Therefore, Jupyter AI currently defaults to the POSIX login name instead of the username from IdentityProvider when jupyter_collaboration is not installed.

We do have an open issue for adding a configurable traitlet to always use IdentityProvider to get usernames: #866. This has not been implemented.

Changing the code to always use IdentityProvider usernames could be a breaking change, since this would be a major change in the UX for all users without jupyter_collaboration. @ellisonbg Thoughts on this?

@krassowski
Copy link
Member Author

There are three properties: username, name and display_name. Only username is annotated as being provided by IdentityProvider in the code (but it is not). I don't care about the other two but I think the current state is wrong. Either the suggestion to fix the username (as in #1022) need to be applied or the comment in the model changed.

@krassowski
Copy link
Member Author

I don't see why this would be a major change in the UX or any change at all (assuming that the rest of the codebase correctly uses display_name for user-facing display)

@ellisonbg
Copy link
Contributor

I think I understand the situation, but I have a question.

When does IdentityProvider use a "random Jupyter moon" name versus when does it pull a name from somewhere like the OS-level (POSIX on a *nix system) username?

@dlqqq
Copy link
Member

dlqqq commented Oct 14, 2024

@krassowski Ah, I just reviewed the code again and now fully understand the scope of what you're proposing. We're using the login name for both the username and display_name fields:

        login = getpass.getuser()
        initials = login[0].capitalize()
        return ChatUser(
            username=login,
            initials=initials,
            name=login,
            display_name=login,
            color=None,
            avatar_url=None,
        )

And I've verified that the frontend reads from the display_name field to render the header to human messages in chat-messages.tsx:

  const name =
    props.message.type === 'human'
      ? props.message.client.display_name
      : props.message.persona.name;

So, changing username to use the username provided by the IdentityProvider should not have any UX changes, and this is a safe change to make.

Apologies for my confusion earlier; I keep getting tripped over how the IdentityProvider calls user IDs "usernames" and usernames "display names". 🤦

@krassowski
Copy link
Member Author

When does IdentityProvider use a "random Jupyter moon" name versus when does it pull a name from somewhere like the OS-level (POSIX on a *nix system) username?

@ellisonbg:

  • the default implementations in jupyter-server (IdentityProvider and PasswordIdentityProvider) always use "random Jupyter moon" and never exposes the OS-level username.

    there is a LegacyIdentityProvider in jupyter-server which can take the user info from the login form handler, but this is implementation-dependent

  • when using JupyterHub a custom JupyterHubIdentityProvider is used instead which pulls the user details from the auth provider; for example LocalAuthenticator or LDAPAuthenticator would use POSIX or LDAP username respectively
  • on real-world deployments outside of JupyterHub administrators may swap IdentityProvider to custom sub-class

jupyter-ai currently:

  • re-uses the user information from IdentityProvider if the jupyter-collaboration extension is installed & enabled (which might be "random jupyter moon" or something from OS-level or anything else really)
  • force-sets the OS-level username otherwise

It feels to me like jupyter-ai should not be overriding the default user details. Instead if the default username provided in single-user context of default jupyter-server installation is useless (it is) maybe this should be addressed in jupyter-server instead. There are some potential privacy trade-offs to making that change but I feel the issue should be brought upstream to jupyter-server for discussion.

@krassowski
Copy link
Member Author

Apologies for my confusion earlier; I keep getting tripped over how the IdentityProvider calls user IDs "usernames" and usernames "display names". 🤦

Yeah that's slightly confusing but I also understand how "username" got to have the meaning of unique ID in this context.

For further reference:

@krassowski
Copy link
Member Author

It feels to me like jupyter-ai should not be overriding the default user details. Instead if the default username provided in single-user context of default jupyter-server installation is useless (it is) maybe this should be addressed in jupyter-server instead. There are some potential privacy trade-offs to making that change but I feel the issue should be brought upstream to jupyter-server for discussion.

And if that turns out to not be possible then I think that #866 seems like a good second-best option.

@ellisonbg
Copy link
Contributor

I think I understand the situation. On the implementation side, I believe the idea of IdentityProvider was to provide a single API for all things identity related. Thus, in the long term, it will likely only create problems if parts of Jupyter (like Jupyter AI) are diverging from just using IdentityProvider as the source of truth for identity. The Jupyter AI use case does point to the need for the IdentityProvider API to be active even when the user is not collaborating.

On the UX side, I do think that there are some cases where a single user is working and that the IdentityProvider should return a system derived username and display name. If I am using JupyterLab on my laptop and JupyterLab needs to know what to call me (for Jupyter AI or another purpose) I definitely want to be called by my human name or username rather than a random Jupyter moon. From that perspective, I do think that Jupyter AI has the right user experience for the single user case, even if its implementation isn't fully aligned with the IdentityProvider direction. In terms of how the base IdentityProvider should decide if it is a single user, maybe it could be something like (is the server only listening on localhost? || is jupyter-collaboration not installed?). If either of those two conditions are satisifed, I think it is reasonably safe to assume it is just one user (please correct me if I am missing something here).

Thanks for bringing these issues up @krassowski !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants