Introducing the Iterative Trainer #737
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
@younesbelkada @lvwerra what do you think of the structure of the trainer? Should we add a generate function or should the generation step be done outside of the trainer?
Hello @younesbelkada @lvwerra, any news on this? I think that's a required step before adding an example for #704.
Thanks a lot for your great work! Design-wise the PR looks great, I just left minor questions. I expect this API to be public, therefore we need to add some documentation; would you be happy doing so? I can also help you if needed!
Thanks!
trl/trainer/iterative_trainer.py
Outdated
Attributes:
    **config** (`IterativeConfig`) -- Configuration object for IterativeTrainer.
    **model** (`PreTrainedModel`) -- Model to be optimized, Hugging Face transformer model with a causal language modeling head.
        Check the documentation of `PreTrainedModelWrapper` for more details.
Here we're using a `PreTrainedModel`, right?
trl/trainer/iterative_trainer.py
Outdated
raise ValueError(
    f"tokenizer must be a PreTrainedTokenizerBase like a PreTrainedTokenizer or a PreTrainedTokenizerFast, got {type(tokenizer)}"
)
You can also check here if the model is an instance of `PreTrainedModel`.
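For example, something along these lines (the error message wording is just an illustration, not the exact text to use):

```python
from transformers import PreTrainedModel

def check_model(model):
    # Mirror the tokenizer check above: fail early if the model is not a
    # transformers PreTrainedModel.
    if not isinstance(model, PreTrainedModel):
        raise ValueError(
            f"model must be a PreTrainedModel, got {type(model)}"
        )
```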
trl/trainer/iterative_trainer.py
Outdated
self.is_encoder_decoder = hasattr(self.model, "is_encoder_decoder")

def prepare_model_inputs(self, input_ids: torch.Tensor, attention_mask: torch.Tensor, labels: torch.Tensor):
    if self.is_encoder_decoder:
Can it be an encoder-decoder? From the docstring above it seems only decoder-based models are supported.
trl/trainer/iterative_trainer.py
Outdated
f"tokenizer must be a PreTrainedTokenizerBase like a PreTrainedTokenizer or a PreTrainedTokenizerFast, got {type(tokenizer)}" | ||
) | ||
|
||
# Step 1: Initialize Accelerator |
You can also add a stronger check: one can check `model.can_generate()` and raise a warning if that method returns `False`.
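A minimal sketch of that check (the warning text is an assumption):

```python
import warnings

def warn_if_not_generative(model):
    # Stronger check: PreTrainedModel.can_generate() tells us whether the model
    # can be used with .generate(); warn instead of erroring out.
    if not model.can_generate():
        warnings.warn(
            f"The model {model.__class__.__name__} does not support generation; "
            "iterative training on generated samples may not behave as expected."
        )
```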
Hello @younesbelkada, thanks for your feedback. I have added the requested modifications to handle seq2seq models, along with some documentation and a few other additions. It should be ready to be merged.
Thanks a lot for your great work @gaetanlop!! Looks very clean to me!
I left three comments, otherwise it looks really great! Looking forward to merging this PR!
Hey @younesbelkada, thanks for your feedback. I made the requested changes.
Hi @younesbelkada, still interested in this trainer? I don't think this will be useful for #704 as the SFTTrainer will do the job for the training phase, but it would be useful for rejection sampling and for this paper from Google DeepMind (https://arxiv.org/pdf/2306.13649.pdf), which seems to be the new state of the art for LLM knowledge distillation. It uses on-policy generated samples at each optimization step.
Hi @gaetanlop, sorry for the delay, people working on TRL have been a bit busy with other projects the past few weeks. We plan to do a release this week and adding this PR would be nice (if you have time of course).
My two main points:
- should we inherit from `Trainer` to get a lot of upstream functionality for free? (rough sketch below)
- there is some preprocessing missing, I believe: making sure we pad/truncate sequences
Let me know what you think. cc @younesbelkada
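For the first point, a rough sketch of what inheriting from `transformers.Trainer` could look like (the `step` signature and body here are assumptions, not the final implementation):

```python
from transformers import Trainer

class IterativeSFTTrainer(Trainer):
    """Sketch: inheriting from Trainer gives saving, logging and hub pushing
    for free; only the per-batch optimization step is custom."""

    def step(self, input_ids, attention_mask=None, labels=None):
        # Assumes self.optimizer / self.lr_scheduler were created beforehand,
        # e.g. via self.create_optimizer_and_scheduler(num_training_steps=...).
        self.model.train()
        if labels is None:
            labels = input_ids  # causal LM: the model shifts labels internally
        outputs = self.model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        outputs.loss.backward()
        self.optimizer.step()
        self.lr_scheduler.step()
        self.optimizer.zero_grad()
        return outputs.loss.detach()
```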
trl/trainer/iterative_sft_config.py
Outdated
@dataclass
class IterativeSFTConfig(object):
Note that we ported the CLI args to `tyro` since you opened the PR. Not a big change; you can for example look at https://github.com/huggingface/trl/blob/main/trl/trainer/ppo_config.py
I have removed the `IterativeSFTConfig` as it is not useful anymore now that we inherit from `Trainer`.
batch_dict = {}
batch_dict.update(model_inputs)

def collator(data):
I think we are missing a bit of preprocessing here:
- truncation of long docs
- padding if not all sequences have the same length
We could use the `DataCollatorForLanguageModeling` to implement some of that logic. Maybe we need to pass some additional kwargs to the `step` method.
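For illustration, a rough sketch of that idea (the tokenizer and `max_length` here are just examples); truncation would happen at tokenization time, while the collator takes care of padding and labels:

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Truncate long documents when tokenizing...
texts = ["a short example", "a much longer example " * 200]
encodings = tokenizer(texts, truncation=True, max_length=512)

# ...and let the collator pad the batch; with mlm=False it also builds
# causal-LM labels (padding positions set to -100).
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
batch = collator([{"input_ids": ids} for ids in encodings["input_ids"]])
print(batch["input_ids"].shape, batch["labels"].shape)
```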
Padding is already done inside `prepare_model_inputs`. I have added truncation.
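For reference, the truncation added here follows roughly this logic (a simplified sketch; the actual implementation lives in the trainer and respects `truncation_mode`):

```python
def truncate(input_ids, max_length, truncation_mode="keep_end"):
    # Keep the last `max_length` tokens ("keep_end") or the first ones otherwise.
    if len(input_ids) <= max_length:
        return input_ids
    if truncation_mode == "keep_end":
        return input_ids[-max_length:]
    return input_ids[:max_length]
```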
Thanks a lot for your great work! Agreed with all the points shared by @lvwerra.
My comments are similar to @lvwerra's comments:
1- Let's inherit from `transformers.Trainer` to benefit from all the features of the trainer; in the current implementation the saving / pushing-to-hub mechanisms are missing, for instance.
2- Let's properly tokenize a batch of sequences instead of tokenizing them one by one.
3- Let's use explicit arguments in `step` to make sure we avoid unexpected behaviours.
4- We should use `DataCollatorForLanguageModeling` to properly handle padding.
5- We can't use `model.is_peft_model` as `model` is a `PreTrainedModel`; I proposed an alternative to check if the model is a PEFT model (see the sketch after this list).
Thanks!
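Regarding point 5, a possible version of the check (a sketch; the PR may do it slightly differently):

```python
def model_is_peft(model):
    # Detect a PEFT-wrapped model without relying on PreTrainedModelWrapper attributes.
    try:
        from peft import PeftModel
    except ImportError:
        return False
    return isinstance(model, PeftModel)
```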
Hi @younesbelkada @lvwerra, thanks for the review. I have made the changes. The padding part was already done inside `prepare_model_inputs`. I have removed the `IterativeSFTConfig`.
Looking great, thanks! I left one question: I think that `is_encoder_decoder` has no effect as it is an attribute of `PreTrainedModelWrapper`: https://github.com/huggingface/trl/blob/main/trl/models/modeling_value_head.py#L286. Can you try to remove it and see if the tests pass? Or alternatively, try the alternative I suggested.
trl/trainer/iterative_sft_trainer.py
Outdated
"When no scheduler is provided, you need to set the total number of training steps to perform `max_steps`" | ||
) | ||
|
||
self.is_encoder_decoder = hasattr(model, "is_encoder_decoder") |
Suggested change:
- self.is_encoder_decoder = hasattr(model, "is_encoder_decoder")
+ self.is_encoder_decoder = getattr(model.config, "is_encoder_decoder", False)
Indeed, that's an error
trl/trainer/iterative_sft_trainer.py
Outdated
"When no scheduler is provided, you need to set the total number of training steps to perform `max_steps`" | ||
) | ||
|
||
self.is_encoder_decoder = hasattr(model, "is_encoder_decoder") |
self.is_encoder_decoder = hasattr(model, "is_encoder_decoder")
I also wonder if this is needed in the first place?
Yes, we need it to decide which collator to use in case the user didn't specify one in the init.
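i.e. roughly this logic (a sketch of the intent, not the exact code in the PR):

```python
from transformers import DataCollatorForLanguageModeling, DataCollatorForSeq2Seq

def pick_default_collator(tokenizer, is_encoder_decoder, data_collator=None):
    # Keep a user-provided collator; otherwise choose one based on the architecture.
    if data_collator is not None:
        return data_collator
    if is_encoder_decoder:
        return DataCollatorForSeq2Seq(tokenizer)
    return DataCollatorForLanguageModeling(tokenizer, mlm=False)
```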
@younesbelkada Looks like the tests do not start for the last commit.
Almost good to merge, I would say! One question remaining regarding evaluation: I think we can probably just keep track of how many optimization steps have been run and call the `Trainer` evaluation when it's time.
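Something along these lines could work (a hypothetical helper; `state.global_step` and `evaluate` are the standard `Trainer` attributes):

```python
def maybe_evaluate(trainer, eval_steps):
    # Hypothetical helper: bump the optimization-step counter after each call to
    # `step` and run the regular Trainer evaluation loop every `eval_steps` steps.
    trainer.state.global_step += 1
    if eval_steps is not None and trainer.state.global_step % eval_steps == 0:
        trainer.evaluate()
```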
**optimizers** (`Tuple[torch.optim.Optimizer, torch.optim.lr_scheduler.LambdaLR]`): -- The optimizer and scheduler to use for training.
**data_collator** (Union[DataCollatorForLanguageModeling, DataCollatorForSeq2Seq], *optional*) -- Data collator to be used for training and
    passed along the dataloader.
**eval_dataset** (`datasets.Dataset`): The dataset to use for evaluation.
Currently, we never evaluate, right?
Indeed, the user had to call the evaluate function of the iterative trainer. I have made some changes.
trl/trainer/iterative_sft_trainer.py
Outdated
    optimizers=optimizers,
    preprocess_logits_for_metrics=preprocess_logits_for_metrics,
)

self.optimizer, self.lr_scheduler = optimizers
Isn't that redundant? The parent class should set the optimizer/scheduler, no?
Indeed, thanks for catching this. I have removed it and made the necessary changes.
I made some changes to enable simple logging and evaluation. The tests should be fixed.
Thanks! LGTM! 🚀
Thanks a lot for your great work! Let's 🚢 it!
* initial skeleton
* iterative trainer for decoder only
* iterative trainer unittest
* encoder_decoder support
* fix typo in unittest
* init
* fix typo
* fix init typo
* adding loggings and safety checker
* fixed minor issues
* doc
* table of contents update
* add test for seq2seq2 models
* change year
* adding text as step input
* precommit
* fixing typo
* run precommit
* fixing typo in safety checker
* fix text tokenization issue
* add truncate and inherit from trainer
* remove iterative config from tests
* remove iterative config from init
* fix peft model
* change truncation side based on truncation_mode
* removed iterativeconfig autodoc
* fixed typo in trainer.mdx
* remove mention of iterative config in docs
* make sure optimizer and scheduler are created
* adding max_steps to test
* remove log_stats fn
* remove compute loss
* fixing encoder decoder detection
* fix PPODecorator
* run precommit
* fix testing
* fix small typos in iterative trainer
* adapted function log and eval
This PR is a follow-up to the Iterative Trainer requested in #704 and #576. It introduces a way to finetune models with methods that require some steps (e.g. generation) between optimization steps.
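For reference, iterative finetuning with this trainer could look roughly like the following (a sketch based on this PR; exact argument names may differ in the released version):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import IterativeSFTTrainer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

args = TrainingArguments(output_dir="iterative-sft", max_steps=100, per_device_train_batch_size=2)
trainer = IterativeSFTTrainer(model=model, args=args, tokenizer=tokenizer)

for _ in range(10):
    # 1. Generate samples with the current model (generation stays outside the trainer).
    prompts = ["The capital of France is"] * 2
    inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
    generations = model.generate(**inputs, max_new_tokens=16, pad_token_id=tokenizer.eos_token_id)
    texts = tokenizer.batch_decode(generations, skip_special_tokens=True)

    # 2. Optionally filter / rank the generations here (rejection sampling, reward model, ...).

    # 3. Run an optimization pass on the selected texts.
    trainer.step(texts=texts)
```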