Different ranges in the results graphs #991

Open
Josephts1 opened this issue Jan 22, 2025 · 2 comments

Labels
detect (Object Detection issues, PR's), question (Further information is requested)

Comments

@Josephts1

Question

I'm looking for the best way to train a YOLO model for mandarin detection with my own dataset.
First, I tried a pre-trained model (model=YOLO('yolo11s.pt')). I got fairly good results, but the curves showed a lot of peaks and troughs (see image 1) and training took too long.
Second, I tried the same pre-trained model but with the first 24 layers of the YOLO model frozen. I got better results (higher mAP50-95 and better convergence), but there were still peaks and troughs (see image 2).
Finally, I trained a model from scratch (model=YOLO('yolo11s.yaml')) and got the best result so far (see image 3).

My question is: if you look at the vertical axes of the train/box_loss, val/box_loss, train/cls_loss, val/cls_loss, train/dfl_loss and val/dfl_loss graphs, they differ a lot between the pre-trained model and the model trained from scratch. Does anyone know why these ranges are so different? One has values between 0 and 1, while the other has values between 0 and 5.

(Image 1: results for the pre-trained model)
(Image 2: results with the first 24 layers frozen)
(Image 3: results for the model trained from scratch)
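
For reference, the three setups above correspond roughly to the following sketch (the dataset file mandarins.yaml and the hyperparameter values are placeholders, not the exact configuration used):

```python
from ultralytics import YOLO

# 1) Fine-tune the pre-trained checkpoint
model = YOLO("yolo11s.pt")
model.train(data="mandarins.yaml", epochs=100, imgsz=640)

# 2) Fine-tune the pre-trained checkpoint with the first 24 layers frozen
model = YOLO("yolo11s.pt")
model.train(data="mandarins.yaml", epochs=100, imgsz=640, freeze=24)

# 3) Train from scratch using only the architecture definition (random weights)
model = YOLO("yolo11s.yaml")
model.train(data="mandarins.yaml", epochs=100, imgsz=640)
```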

Additional

No response

Josephts1 added the question (Further information is requested) label on Jan 22, 2025
UltralyticsAssistant added the detect (Object Detection issues, PR's) label on Jan 22, 2025
@UltralyticsAssistant
Member

👋 Hello @Josephts1, thank you for raising this issue about Ultralytics HUB 🚀! It looks like you’re working on training YOLO models for mandarin detection, and we appreciate you sharing your training details along with the results graphs 📊.

To help address your question, please take a look at our HUB Docs for guidance on best practices and potential insights:

  • Quickstart. Begin training and managing YOLO models with HUB in just a few steps.
  • Models: Training and Exporting. Learn how to effectively train YOLO models and export them for deployment.
  • Metrics. Understand loss functions (e.g., box_loss, cls_loss, dfl_loss) and performance metrics to interpret your training and evaluation results.

If you're observing varying loss graph ranges, this could depend on the model's architecture, training dynamics, or specific dataset characteristics (scale, variability, annotations). Please share additional context, such as:

  1. The dataset details: How it's labeled and structured, along with any preprocessing steps.
  2. The exact training configuration: Hyperparameters, optimizer settings, or any custom modifications made.
  3. The commands or code snippets used: Sharing these could help us investigate further.

If this is a potential 🐛 Bug Report, please also include a minimum reproducible example (MRE) to assist us in reproducing the behavior.

Our engineering team will look into this further and get back to you soon. Your patience and detailed input are much appreciated, as they help us continue improving the HUB platform! 🚀😊

@pderrenger
Member

@Josephts1 thank you for your detailed explanation and the accompanying graphs! It’s great to see your experimentation with different training strategies for mandarin detection. The variation in loss ranges you observed is a common occurrence and can be explained by the following factors:

  1. Pre-trained Model vs. Training from Scratch:

    • Pre-trained models, like yolo11s.pt, are initialized with weights optimized on large datasets (e.g., COCO). These weights are already well-tuned, resulting in smaller initial losses. The loss values typically start closer to the optimal range (e.g., between 0 and 1) when fine-tuning such models.
    • On the other hand, when training from scratch (using yolo11s.yaml), the model starts with randomly initialized weights. The loss values are initially much higher (e.g., 5 or above) because the model has no prior knowledge and must learn everything from the ground up.
  2. Freezing Layers:

    • Freezing layers (e.g., the first 24 layers in your experiment) reduces the number of parameters being updated, often leading to smoother convergence. However, even with frozen layers, the pre-trained weights still influence the initial loss ranges, keeping them relatively low compared to training from scratch.
  3. Dataset-Specific Characteristics:

    • The variability in loss ranges can also depend on your dataset. Factors like class imbalance, annotation quality, and dataset size can lead to differences in how the model optimizes during training. For instance, if your dataset has very different characteristics from COCO, training from scratch might better align the model to your specific task, as seen in your results.
  4. Loss Function Dynamics:

    • The different loss components (e.g., box_loss, cls_loss, dfl_loss) have their own scales and dynamics based on the training strategy and the model's initialization. Pre-trained models might start closer to their optimal point, whereas training from scratch involves a broader exploration of the parameter space, leading to higher initial loss values.
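
One way to see this concretely is to compare the per-epoch loss values each run writes to its results.csv file. A minimal sketch, assuming the default runs/detect/trainN output layout (the run directory names are placeholders):

```python
import pandas as pd

runs = {
    "pretrained": "runs/detect/train/results.csv",     # placeholder run directories
    "from_scratch": "runs/detect/train3/results.csv",
}

for name, path in runs.items():
    df = pd.read_csv(path)
    df.columns = df.columns.str.strip()  # some versions pad column names with spaces
    first, last = df["train/box_loss"].iloc[0], df["train/box_loss"].iloc[-1]
    print(f"{name}: train/box_loss starts at {first:.2f} and ends at {last:.2f}")
```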

Recommendations:

  • Choose the Best Strategy: Based on your observations, training from scratch seems to yield the best results for your dataset. This makes sense if your dataset (mandarin detection) is very different from the COCO dataset used to pre-train the model.
  • Monitor Class Distribution: Ensure your dataset has a balanced class distribution to avoid potential biases in loss calculations.
  • Use Validation Metrics: Focus on validation metrics like mAP50-95 to evaluate model performance, rather than the absolute loss values, as these are more indicative of real-world performance.
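
For example, after training you can run a standalone validation pass and read mAP50-95 directly instead of comparing raw loss magnitudes across runs (the weights path and dataset file below are placeholders):

```python
from ultralytics import YOLO

model = YOLO("runs/detect/train/weights/best.pt")  # placeholder path to trained weights
metrics = model.val(data="mandarins.yaml")         # placeholder dataset file

print(f"mAP50-95: {metrics.box.map:.3f}")
print(f"mAP50:    {metrics.box.map50:.3f}")
```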

If you'd like to further analyze or adjust training behavior, consider visualizing additional metrics or leveraging tools provided in the Ultralytics HUB. The HUB allows for streamlined dataset management, training, and monitoring of results.

Let me know if you have further questions or need clarification! 😊
