
SFTTrainer using quantized model + peft does not work #1055

Closed
osanseviero opened this issue Dec 3, 2023 · 3 comments · Fixed by #1064

Labels
⚡ PEFT (Related to PEFT) · 🏋 SFT (Related to SFT)

Comments

osanseviero (Contributor) commented Dec 3, 2023

The example code from the official docs fails with the error below:

from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM
from trl import SFTTrainer

dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-neo-125m",
    load_in_8bit=True,
    device_map="auto",
)

trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=peft_config,
)

Error logs

AttributeError                            Traceback (most recent call last)
[<ipython-input-3-ffcd404fb112>](https://localhost:8080/#) in <cell line: 18>()
     16 )
     17 
---> 18 trainer = SFTTrainer(
     19     model,
     20     train_dataset=dataset,

[/usr/local/lib/python3.10/dist-packages/trl/trainer/sft_trainer.py](https://localhost:8080/#) in __init__(self, model, args, data_collator, train_dataset, eval_dataset, tokenizer, model_init, compute_metrics, callbacks, optimizers, preprocess_logits_for_metrics, peft_config, dataset_text_field, packing, formatting_func, max_seq_length, infinite, num_of_sequences, chars_per_token, dataset_num_proc, dataset_batch_size, neftune_noise_alpha, model_init_kwargs)
    169                     )
    170 
--> 171                     preprare_model_kwargs = {"use_gradient_checkpointing": args.gradient_checkpointing}
    172 
    173                     if _support_gc_kwargs:

AttributeError: 'NoneType' object has no attribute 'gradient_checkpointing'

Versions

transformers==4.35.2
trl==0.7.5.dev0 (installed from source; I also tried the latest release and got the same error).

Additional Info

If I pass my own TrainingArguments with gradient_checkpointing=True, everything works.
Specifically, this line https://github.com/huggingface/trl/blob/main/trl/trainer/sft_trainer.py#L171 fails when args is None.
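
For reference, a minimal sketch of that workaround, assuming the same dataset, model, and peft_config as in the snippet above; the output_dir and batch size are illustrative values, not from the original report:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./sft-gpt-neo-125m",   # hypothetical output path
    per_device_train_batch_size=4,     # illustrative value
    gradient_checkpointing=True,
)

trainer = SFTTrainer(
    model,
    args=training_args,                # passing args explicitly avoids the NoneType error
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=peft_config,
)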

elliotttruestate commented
What is the specific source of that example code?

The example script at https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py shows the expected inputs and code structure.

osanseviero (Contributor, Author) commented
lvwerra (Member) commented Dec 5, 2023

cc @younesbelkada
