[P1] Support more huggingface (transformer-based) models #46

Open
frankaging opened this issue Jan 13, 2024 · 3 comments
Labels: good first issue (Good for newcomers)

Comments

@frankaging
Collaborator

frankaging commented Jan 13, 2024

Description:
Ideally, this library can support all the models listed here without exposing model details to its users.

This requires setting up a model folder for each model type and writing config metadata for each one that annotates where interventions can be applied. That is a lot of work, so this issue tracks progress toward supporting as many models as we can.

Each model should take less than an hour: (1) write the config and (2) write simple unit tests.
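As a rough illustration of the two steps above, here is a minimal, self-contained sketch of what per-model config metadata and a simple check could look like. Everything in it (`InterventionPoint`, `BERT_POINTS`, `resolve`, `check_config`, and the hook names) is hypothetical and is not this library's actual API. The module paths assume the layout of Hugging Face's `BertModel`.

```python
from dataclasses import dataclass

# Hypothetical hook kinds; illustrative names, not the library's constants.
INPUT_HOOK = "input"
OUTPUT_HOOK = "output"


@dataclass(frozen=True)
class InterventionPoint:
    module_template: str  # module path with a %s placeholder for the layer index
    hook: str             # whether to intervene on the module's input or output


# Assumed mapping for a BERT-style encoder (paths follow Hugging Face's BertModel).
BERT_POINTS = {
    "block_input": InterventionPoint("encoder.layer[%s]", INPUT_HOOK),
    "block_output": InterventionPoint("encoder.layer[%s]", OUTPUT_HOOK),
    "attention_output": InterventionPoint(
        "encoder.layer[%s].attention.output", OUTPUT_HOOK
    ),
    "mlp_output": InterventionPoint("encoder.layer[%s].output", OUTPUT_HOOK),
}


def resolve(points, name, layer):
    """Return (module path, hook kind) for a named intervention point at a layer."""
    point = points[name]
    return point.module_template % layer, point.hook


def check_config(points, num_layers):
    """Unit-test style check: every entry resolves cleanly for every layer."""
    for name in points:
        for layer in range(num_layers):
            path, hook = resolve(points, name, layer)
            assert "%s" not in path
            assert hook in (INPUT_HOOK, OUTPUT_HOOK)
    return True
```

Under this sketch, supporting a new model type would mostly mean writing one more mapping like `BERT_POINTS` and running the same check against it.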

Models in the pipeline, in priority order:

  • BERT-family
    • RoBERTa
    • DeBERTa
    • ELECTRA
  • XLM (multilingual model)
  • T5
  • Mistral
  • Mixtral (MoE, MixtralForCausalLM)
  • Phi
  • Mamba (requires support for recurrent interventions, not just layerwise interventions)
  • backpack-gpt2
  • Please feel free to suggest other models to support!
@frankaging
Collaborator Author

Update: most of the work is complete and is tracked here: https://github.com/stanfordnlp/pyvene/tree/dev/models and https://github.com/stanfordnlp/pyvene/tree/peterwz

@SubramanyamSahoo

@frankaging, I see that we have an extensive list of models to support. Would it be beneficial to prioritize them based on their popularity or relevance to the community? This could help us focus our efforts more effectively and ensure that we address the most pressing needs first.

@SubramanyamSahoo

SubramanyamSahoo commented Apr 1, 2024

  • models

Certainly! Here are some additional transformer-based models that could be considered for support:

  • GPT-3
  • OpenAI Codex
  • GPT-Neo
  • GPT-J
  • BART (Bidirectional and Auto-Regressive Transformers)
  • Groq


3 participants