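I don't know why, but although the training data is sufficient, the results are poor and fine-tuning seems to have had almost no effect. Any advice would be appreciated!
System Info
llamafactory version: 0.9.1.dev0
Below are some of my parameters and the dataset contents.
Below is my training loss plot.
Training script:
Log:
Training completed.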
[INFO|2024-12-15 22:34:50] parser.py:355 >> Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
[INFO|2024-12-15 22:34:50] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:34:50] configuration_utils.py:746 >> Model config Qwen2VLConfig { "_name_or_path": "/root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f", "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file vocab.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file merges.txt
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file added_tokens.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file special_tokens_map.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer_config.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2475 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|2024-12-15 22:34:50] image_processing_base.py:373 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/preprocessor_config.json
[INFO|2024-12-15 22:34:50] image_processing_base.py:373 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/preprocessor_config.json
[INFO|2024-12-15 22:34:50] image_processing_base.py:429 >> Image processor Qwen2VLImageProcessor { "do_convert_rgb": true, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.48145466, 0.4578275, 0.40821073 ], "image_processor_type": "Qwen2VLImageProcessor", "image_std": [ 0.26862954, 0.26130258, 0.27577711 ], "max_pixels": 12845056, "merge_size": 2, "min_pixels": 3136, "patch_size": 14, "processor_class": "Qwen2VLProcessor", "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "max_pixels": 12845056, "min_pixels": 3136 }, "temporal_patch_size": 2 }
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file vocab.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file merges.txt
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file added_tokens.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file special_tokens_map.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer_config.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2475 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|2024-12-15 22:34:51] processing_utils.py:755 >> Processor Qwen2VLProcessor:
image_processor: Qwen2VLImageProcessor { "do_convert_rgb": true, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.48145466, 0.4578275, 0.40821073 ], "image_processor_type": "Qwen2VLImageProcessor", "image_std": [ 0.26862954, 0.26130258, 0.27577711 ], "max_pixels": 12845056, "merge_size": 2, "min_pixels": 3136, "patch_size": 14, "processor_class": "Qwen2VLProcessor", "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "max_pixels": 12845056, "min_pixels": 3136 }, "temporal_patch_size": 2 }
tokenizer: Qwen2TokenizerFast(name_or_path='/root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f', vocab_size=151643, model_max_length=32768, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False), added_tokens_decoder={ 151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, 
normalized=False, special=True), 151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
{ "processor_class": "Qwen2VLProcessor" }
[INFO|2024-12-15 22:34:51] logging.py:157 >> Replace eos token: <|im_end|>
[INFO|2024-12-15 22:34:51] logging.py:157 >> Loading dataset data_no_history.json...
[INFO|2024-12-15 22:34:55] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:34:55] configuration_utils.py:746 >> Model config Qwen2VLConfig { "_name_or_path": "/root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f", "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:34:55] modeling_utils.py:3934 >> loading weights file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/model.safetensors.index.json
[INFO|2024-12-15 22:34:55] modeling_utils.py:1670 >> Instantiating Qwen2VLForConditionalGeneration model under default dtype torch.bfloat16.
[INFO|2024-12-15 22:34:55] configuration_utils.py:1096 >> Generate config GenerationConfig { "bos_token_id": 151643, "eos_token_id": 151645 }
[INFO|2024-12-15 22:34:55] modeling_utils.py:1670 >> Instantiating Qwen2VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
[WARNING|2024-12-15 22:34:55] logging.py:168 >> Qwen2VLRotaryEmbedding can now be fully parameterized by passing the model config through the config argument. All other arguments will be removed in v4.46
[INFO|2024-12-15 22:35:00] modeling_utils.py:4800 >> All model checkpoint weights were used when initializing Qwen2VLForConditionalGeneration.
[INFO|2024-12-15 22:35:00] modeling_utils.py:4808 >> All the weights of Qwen2VLForConditionalGeneration were initialized from the model checkpoint at /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f. If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2VLForConditionalGeneration for predictions without further training.
[INFO|2024-12-15 22:35:00] configuration_utils.py:1049 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/generation_config.json
[INFO|2024-12-15 22:35:00] configuration_utils.py:1096 >> Generate config GenerationConfig { "bos_token_id": 151643, "do_sample": true, "eos_token_id": [ 151645, 151643 ], "pad_token_id": 151643, "temperature": 0.01, "top_k": 1, "top_p": 0.001 }
[INFO|2024-12-15 22:35:00] logging.py:157 >> Gradient checkpointing enabled.
[INFO|2024-12-15 22:35:00] logging.py:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-15 22:35:00] logging.py:157 >> Upcasting trainable params to float32.
[INFO|2024-12-15 22:35:00] logging.py:157 >> Fine-tuning method: LoRA
[INFO|2024-12-15 22:35:00] logging.py:157 >> Found linear modules: gate_proj,up_proj,q_proj,v_proj,o_proj,down_proj,k_proj
[INFO|2024-12-15 22:35:01] logging.py:157 >> trainable params: 20,185,088 || all params: 8,311,560,704 || trainable%: 0.2429
[INFO|2024-12-15 22:35:01] trainer.py:698 >> Using auto half precision backend
[INFO|2024-12-15 22:35:01] trainer.py:2313 >> ***** Running training *****
[INFO|2024-12-15 22:35:01] trainer.py:2314 >> Num examples = 450
[INFO|2024-12-15 22:35:01] trainer.py:2315 >> Num Epochs = 3
[INFO|2024-12-15 22:35:01] trainer.py:2316 >> Instantaneous batch size per device = 2
[INFO|2024-12-15 22:35:01] trainer.py:2319 >> Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|2024-12-15 22:35:01] trainer.py:2320 >> Gradient Accumulation steps = 8
[INFO|2024-12-15 22:35:01] trainer.py:2321 >> Total optimization steps = 84
[INFO|2024-12-15 22:35:01] trainer.py:2322 >> Number of trainable parameters = 20,185,088
[INFO|2024-12-15 22:35:58] logging.py:157 >> {'loss': 0.6811, 'learning_rate': 4.9564e-05, 'epoch': 0.18}
[INFO|2024-12-15 22:36:49] logging.py:157 >> {'loss': 0.6575, 'learning_rate': 4.8272e-05, 'epoch': 0.36}
[INFO|2024-12-15 22:37:45] logging.py:157 >> {'loss': 0.6249, 'learning_rate': 4.6168e-05, 'epoch': 0.53}
[INFO|2024-12-15 22:38:37] logging.py:157 >> {'loss': 0.5894, 'learning_rate': 4.3326e-05, 'epoch': 0.71}
[INFO|2024-12-15 22:39:30] logging.py:157 >> {'loss': 0.6103, 'learning_rate': 3.9846e-05, 'epoch': 0.89}
[INFO|2024-12-15 22:40:16] logging.py:157 >> {'loss': 0.5776, 'learning_rate': 3.5847e-05, 'epoch': 1.07}
[INFO|2024-12-15 22:41:05] logging.py:157 >> {'loss': 0.5526, 'learning_rate': 3.1470e-05, 'epoch': 1.24}
[INFO|2024-12-15 22:41:55] logging.py:157 >> {'loss': 0.5321, 'learning_rate': 2.6868e-05, 'epoch': 1.42}
[INFO|2024-12-15 22:42:51] logging.py:157 >> {'loss': 0.5318, 'learning_rate': 2.2201e-05, 'epoch': 1.60}
[INFO|2024-12-15 22:43:43] logging.py:157 >> {'loss': 0.5415, 'learning_rate': 1.7631e-05, 'epoch': 1.78}
[INFO|2024-12-15 22:44:39] logging.py:157 >> {'loss': 0.5456, 'learning_rate': 1.3318e-05, 'epoch': 1.96}
[INFO|2024-12-15 22:45:35] logging.py:157 >> {'loss': 0.5419, 'learning_rate': 9.4128e-06, 'epoch': 2.13}
[INFO|2024-12-15 22:46:25] logging.py:157 >> {'loss': 0.5615, 'learning_rate': 6.0507e-06, 'epoch': 2.31}
[INFO|2024-12-15 22:47:09] logging.py:157 >> {'loss': 0.5684, 'learning_rate': 3.3494e-06, 'epoch': 2.49}
[INFO|2024-12-15 22:48:09] logging.py:157 >> {'loss': 0.4966, 'learning_rate': 1.4029e-06, 'epoch': 2.67}
[INFO|2024-12-15 22:48:58] logging.py:157 >> {'loss': 0.5051, 'learning_rate': 2.7923e-07, 'epoch': 2.84}
[INFO|2024-12-15 22:49:40] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84
[INFO|2024-12-15 22:49:40] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:49:40] configuration_utils.py:746 >> Model config Qwen2VLConfig { "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/tokenizer_config.json
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/special_tokens_map.json
[INFO|2024-12-15 22:49:41] image_processing_base.py:258 >> Image processor saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/preprocessor_config.json
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/tokenizer_config.json
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/special_tokens_map.json
[INFO|2024-12-15 22:49:42] processing_utils.py:541 >> chat template saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/chat_template.json
[INFO|2024-12-15 22:49:42] trainer.py:2584 >>
Training completed. Do not forget to share your model on huggingface.co/models =)
[INFO|2024-12-15 22:49:42] image_processing_base.py:258 >> Image processor saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/preprocessor_config.json
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/tokenizer_config.json
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/special_tokens_map.json
[INFO|2024-12-15 22:49:42] processing_utils.py:541 >> chat template saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/chat_template.json
[INFO|2024-12-15 22:49:42] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22
[INFO|2024-12-15 22:49:42] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:49:42] configuration_utils.py:746 >> Model config Qwen2VLConfig { "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/tokenizer_config.json
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/special_tokens_map.json
[WARNING|2024-12-15 22:49:42] logging.py:162 >> No metric eval_loss to plot.
[WARNING|2024-12-15 22:49:42] logging.py:162 >> No metric eval_accuracy to plot.
[INFO|2024-12-15 22:49:42] trainer.py:4117 >> ***** Running Evaluation *****
[INFO|2024-12-15 22:49:42] trainer.py:4119 >> Num examples = 50
[INFO|2024-12-15 22:49:42] trainer.py:4122 >> Batch size = 2
[INFO|2024-12-15 22:49:51] modelcard.py:449 >> Dropping the following result as it does not have all the necessary fields: {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
Log when loading the model:
[INFO|modeling_utils.py:4808] 2024-12-15 22:51:50,813 >> All the weights of Qwen2VLForConditionalGeneration were initialized from the model checkpoint at /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2VLForConditionalGeneration for predictions without further training.
[INFO|configuration_utils.py:1049] 2024-12-15 22:51:50,817 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/generation_config.json
[INFO|configuration_utils.py:1096] 2024-12-15 22:51:50,817 >> Generate config GenerationConfig {
  "bos_token_id": 151643,
  "do_sample": true,
  "eos_token_id": [
    151645,
    151643
  ],
  "pad_token_id": 151643,
  "temperature": 0.01,
  "top_k": 1,
  "top_p": 0.001
}
[INFO|2024-12-15 22:51:50] llamafactory.model.model_utils.attention:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-15 22:51:51] llamafactory.model.adapter:157 >> Merged 1 adapter(s).
[INFO|2024-12-15 22:51:51] llamafactory.model.adapter:157 >> Loaded adapter(s): saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22
[INFO|2024-12-15 22:51:51] llamafactory.model.loader:157 >> all params: 8,291,375,616
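As a quick sanity check on the parameters, the numbers in the "Running training" block of the log can be reproduced by hand. The sketch below is only illustrative: the helper name `total_optimization_steps` is hypothetical, and the floor-division rule for updates per epoch is an assumption about how the Hugging Face `Trainer` counts steps, not something the log itself states. The inputs (450 examples, batch size 2, 8 accumulation steps, 3 epochs, 1 GPU) and the parameter counts are taken from the log.

```python
import math


def total_optimization_steps(num_examples, per_device_batch_size,
                             grad_accum_steps, num_epochs, num_devices=1):
    """Estimate the number of optimizer updates for a run.

    Assumption: updates per epoch are the number of dataloader batches
    floor-divided by the gradient accumulation steps (at least 1).
    """
    batches_per_epoch = math.ceil(
        num_examples / (per_device_batch_size * num_devices)
    )
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return math.ceil(num_epochs * updates_per_epoch)


# Values reported in the training log.
steps = total_optimization_steps(
    num_examples=450, per_device_batch_size=2, grad_accum_steps=8, num_epochs=3
)
effective_batch_size = 2 * 8 * 1  # per-device batch * accumulation * GPUs

# Share of trainable (LoRA) parameters, from the "trainable params" line.
trainable_pct = 100 * 20_185_088 / 8_311_560_704

print(steps, effective_batch_size, round(trainable_pct, 4))
```

Under these assumptions the script gives 84 steps, an effective batch size of 16 and a trainable share of about 0.2429%, which agrees with the log. So the run performed only 84 weight updates in total; whether that is enough for the adapter to change the model's answers is a separate question that the log does not settle.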
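I am sure I selected the trained checkpoint path and loaded the model and the checkpoint with the huggingface backend, but the results are the same as without any training. I also tried identity.json, again selecting the trained checkpoint path and loading the model and the checkpoint with the huggingface backend. The result is still the same: the model keeps answering that it is the Qwen (千问) large model. Result of training on the built-in identity dataset (used unchanged, since this was only a test):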
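What kind of problem could this be? Is it a dataset problem, a parameter problem, or am I using the tool the wrong way? I'm a beginner and would appreciate any guidance 🙏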
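One way to start narrowing this down is to quantify how far the training loss actually moved. The sketch below parses a few loss lines copied from the training log and reports the first loss, the last loss and the relative drop. The helper names `parse_loss_records` and `summarize` are hypothetical, and only three of the sixteen logged entries are included to keep the example short.

```python
import ast

# Three loss lines copied verbatim from the training log (first, middle, last).
LOG_EXCERPT = """
[INFO|2024-12-15 22:35:58] logging.py:157 >> {'loss': 0.6811, 'learning_rate': 4.9564e-05, 'epoch': 0.18}
[INFO|2024-12-15 22:41:55] logging.py:157 >> {'loss': 0.5321, 'learning_rate': 2.6868e-05, 'epoch': 1.42}
[INFO|2024-12-15 22:48:58] logging.py:157 >> {'loss': 0.5051, 'learning_rate': 2.7923e-07, 'epoch': 2.84}
"""


def parse_loss_records(log_text):
    """Return the metric dicts that follow '>> ' on lines that log a loss."""
    records = []
    for line in log_text.splitlines():
        _, sep, payload = line.partition(">> ")
        payload = payload.strip()
        if sep and payload.startswith("{'loss'"):
            # The payload is a Python dict literal, so literal_eval is enough.
            records.append(ast.literal_eval(payload))
    return records


def summarize(records):
    """Compare the first and last logged loss values."""
    first = records[0]["loss"]
    last = records[-1]["loss"]
    return {
        "first_loss": first,
        "last_loss": last,
        "relative_drop": (first - last) / first,
    }


records = parse_loss_records(LOG_EXCERPT)
summary = summarize(records)
print(summary)
```

For this run the loss goes from 0.6811 to 0.5051, a relative drop of roughly 26% over the 84 steps. That number alone does not show whether the cause lies in the data, the hyperparameters or the way the adapter is loaded, but it gives a concrete baseline to compare against when changing the learning rate, the number of epochs or the dataset.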
Originally posted by @Evi233 in #6339
Issues I have already consulted
#690
#4283
#1994
#882
#837
#331