Transformer²: Self-adaptive LLMs 🐙

📚 [Paper] | 📄 [Blog]

Self-adaptive large language models (LLMs) address the limitations of traditional fine-tuning, which is computationally intensive and static in its ability to handle diverse tasks.

We are excited to introduce Transformer², a novel self-adaptation framework that adapts LLMs for unseen tasks in real-time by selectively adjusting only the singular components of their weight matrices. During inference, Transformer² employs a two-pass mechanism: first, a dispatch system identifies the task properties, and then task-specific "expert" vectors, trained using reinforcement learning, are dynamically mixed to obtain targeted behavior for the incoming prompt.
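The two-pass mechanism can be sketched in a few lines of NumPy. Everything below is illustrative, not the repository's implementation: the dispatch logits, expert vectors, and matrix sizes are made up, and in the real system the expert vectors come from RL training and the weights belong to the LLM. The sketch only shows the core idea of mixing expert vectors and rescaling a weight matrix's singular values.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def adapt(W, expert_zs, alphas):
    """Second pass: rescale only the singular values of W."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    z = sum(a * zk for a, zk in zip(alphas, expert_zs))  # mixed expert vector
    return U @ np.diag(S * z) @ Vt

# First pass: a dispatcher scores how well the prompt matches each task
# (made-up logits for three hypothetical tasks, e.g. math / code / reasoning).
task_logits = np.array([2.0, 0.5, -1.0])
alphas = softmax(task_logits)

# Three toy task-specific expert vectors, one scale per singular value.
W = rng.standard_normal((8, 4))
experts = [rng.uniform(0.5, 1.5, size=4) for _ in range(3)]

W_adapted = adapt(W, experts, alphas)
print(W_adapted.shape)  # (8, 4)
```

Note that with all-ones expert vectors the adapted matrix reconstructs W exactly, since only the singular values are touched.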

Installation

1. Clone the Repo

git clone https://github.com/SakanaAI/self-adaptive-llms
cd self-adaptive-llms

2. Install Libraries

conda create -n t2 python=3.11 -y
conda activate t2
pip install --upgrade pip
pip install -r requirements.txt

3. Install the Task Evaluator

cd evaluation/fishfarm
pip install -e .

Usage

We provide example scripts for both training and evaluation.

Edit the arguments in the provided scripts to choose among models and tasks.

Training

bash scripts/train_task_expert.sh
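To give a feel for what training an expert vector means, here is a toy REINFORCE-style (evolution-strategies-flavoured) sketch in NumPy. It is an assumption-laden stand-in, not the repository's training loop: the reward, matrix sizes, and the "true" expert vector are synthetic, whereas the real system optimizes task performance of the full LLM with reinforcement learning.

```python
import numpy as np

rng = np.random.default_rng(0)

def svf_forward(W, z, x):
    # Apply an expert vector z as per-singular-value scales of W.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ ((S * z) * (Vt @ x))

# Toy reward: negative distance to the output under a "true" expert vector.
W = rng.standard_normal((6, 4))
x = rng.standard_normal(4)
target = svf_forward(W, np.array([1.5, 0.5, 1.0, 1.0]), x)

def reward(z):
    return -np.linalg.norm(svf_forward(W, z, x) - target)

z = np.ones(4)          # start from identity scaling (no adaptation)
sigma, lr = 0.1, 0.3
initial = reward(z)
for _ in range(300):
    eps = rng.standard_normal(4) * sigma
    # Antithetic REINFORCE-style estimate of the reward gradient.
    z += lr * (reward(z + eps) - reward(z - eps)) * eps
final = reward(z)
```

After training, the learned expert vector should improve the toy reward relative to the unadapted starting point.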

Evaluation

Prompt-based evaluation

Classification experts can be loaded by specifying the CLS_EXPERT_PATH in the script.

bash scripts/eval_prompt_based.sh

Few-shot evaluation

bash scripts/eval_few_shot.sh

Citation

If you find Transformer² useful for your research, please cite using this BibTeX:

@misc{sun2025transformersquaredselfadaptivellms,
      title={Transformer-Squared: Self-adaptive LLMs}, 
      author={Qi Sun and Edoardo Cetin and Yujin Tang},
      year={2025},
      eprint={2501.06252},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2501.06252}, 
}
