Unusual behavior of performance metrics #988

Open
1 task done
Josephts1 opened this issue Jan 22, 2025 · 2 comments
Labels
detect (Object Detection issues, PR's) · question (Further information is requested)

Comments

@Josephts1

Search before asking

Question

I recently opened a discussion describing the behavior of the performance graphs produced by model.train (see image 1; note the large drop in the first epochs). I was advised to enlarge my dataset and review the quality of my labels. I did so, increasing the number of images to 256, and I was also more careful with the labels (see image 2); image 3 shows what the predictions from model.train now look like. Even after all this, I do not understand the behavior of the mAP50 and mAP50-95 performance metrics. Could you explain why this happens and, if possible, suggest a solution?
I appreciate your help

I attach the code used

from google.colab import drive
drive.mount('/content/drive')

import yaml

data = {
    'path': '/content/drive/MyDrive/Proyecto_de_grado/data',
    'train': 'images/train',
    'val': 'images/val',
    'names': {0: 'fruta'}
}

with open('/content/drive/MyDrive/Proyecto_de_grado/data/data.yaml', 'w') as file:
    yaml.dump(data, file, default_flow_style=False, sort_keys=False)

!pip install -U ultralytics

from ultralytics import YOLO

model=YOLO('yolo11s.pt')

Frez_layers = 24
freeze = [f"model.{x}." for x in range(0, Frez_layers)]
print(freeze)
frozen_params = {}
for k, v in model.named_parameters():
    # print(k)
    v.requires_grad = True  # train all layers
    frozen_params[k] = v.data.clone()
    if any(x in k for x in freeze):
        print(f"freezing {k}")
        v.requires_grad = False

result = model.train(
    data="/content/drive/MyDrive/Proyecto_de_grado/data/data.yaml",
    epochs=100,
    patience=50,
    batch=8,
    plots=True,
    optimizer="auto",
    lr0=1e-4,
    seed=42,
    project="/content/drive/MyDrive/Proyecto_de_grado/runs/freeze_layers/todo_congelado_11s"
)

[Image 1: training metric plots · Image 2: labeled training examples · Image 3: model.train predictions]

Additional

No response

@Josephts1 Josephts1 added the question label Jan 22, 2025
@UltralyticsAssistant UltralyticsAssistant added the detect label Jan 22, 2025
@UltralyticsAssistant
Member

👋 Hello @Josephts1, thank you for raising this issue about Ultralytics HUB 🚀! We appreciate the detailed explanation and inclusion of code and images—it’s very helpful! Please visit our HUB Docs to explore additional resources that might provide some insights.

Regarding the unusual behavior you’ve noticed in your performance metrics 🧐:

  • If this is potentially a 🐛 bug, could you please provide a minimum reproducible example (MRE) by simplifying the provided code? Refer to our MRE guide for details.
  • For your current results, it would also help if you could share these additional details:
    1. The YOLO version you’re using (pip show ultralytics to check).
    2. Device/environment details such as hardware (e.g., GPU/CPU), operating system, and Python version.
    3. A clearer explanation of the specific metric behavior you find unusual, e.g., mAP50 vs. mAP50-95 patterns.

Finally, to facilitate debugging, you might want to consult the Projects section for tips on managing and analyzing results effectively.

This is an automated response 🤖, but rest assured that an Ultralytics engineer will look into your issue shortly. Thank you for your patience as we work to assist you! 🙏

@pderrenger
Member

@Josephts1 thank you for providing a detailed explanation and code! Let's break this down to better understand the behavior you're observing with your performance metrics (mAP50 and mAP50-95) and troubleshoot effectively.

Why the Metrics Behave This Way

  1. Initial mAP Drop: It's common to see a large drop in the metrics in the initial epochs, especially after freezing layers. This happens because:

    • The frozen backbone layers cannot adapt to your data, so feature extraction relies entirely on the pretrained weights.
    • Only the unfrozen layers (likely the head or classifier) are being trained, which can take time to adapt to your dataset, especially if it's small or has unique characteristics.
    • The optimizer needs time to adjust to the new gradients due to freezing.
  2. Dataset Size and Quality: While you increased your dataset size to 256 images and improved label quality, this might still be insufficient for robust training, particularly for complex models like YOLO. A small dataset can lead to overfitting or underfitting, evident in inconsistent metric behavior. Additionally:

    • Dataset Diversity: If the dataset lacks diversity (e.g., similar backgrounds, lighting, or object variations), the model might struggle to generalize well.
    • Label Noise: Even small inaccuracies in labeling can significantly impact metrics like mAP50-95.
  3. Learning Rate and Freezing Layers: Freezing 24 layers and using a learning rate of 1e-4 could lead to slower convergence as the model primarily learns from the head. If the frozen layers are overly restrictive, this might explain the fluctuating behavior in metrics.
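A quick sanity check related to points 1 and 3 is to confirm how much of the model is actually trainable after your freeze loop. A minimal sketch using standard PyTorch attributes (the printed percentage is only a diagnostic, not a target value):

    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"Trainable parameters: {trainable:,} / {total:,} ({100 * trainable / total:.1f}%)")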

Recommendations to Improve the Results

Here are some steps to refine your training process and potentially stabilize the metrics:

1. Adjust Freezing Strategy

  • Instead of freezing 24 layers, consider freezing fewer layers to allow more of the backbone to adapt to your dataset. For instance:
    freeze = [f"model.{x}." for x in range(0, 10)]  # Freeze fewer layers
  • Experiment with different freezing configurations to find a balance that works best for your dataset.
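  • If your installed Ultralytics version supports it, model.train() also accepts a freeze argument that freezes the first N layers for you, avoiding the manual named_parameters() loop entirely. A minimal sketch (freeze=10 is only an example value to experiment with):
    from ultralytics import YOLO

    model = YOLO("yolo11s.pt")
    result = model.train(
        data="/content/drive/MyDrive/Proyecto_de_grado/data/data.yaml",
        epochs=100,
        batch=8,
        freeze=10,  # freeze the first 10 layers; tune this per experiment
    )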

2. Increase Dataset Size and Diversity

  • Aim for a higher dataset size (e.g., >500 images) to provide the model with more examples to learn from.
  • Ensure diverse scenarios, including different object orientations, sizes, lighting, and backgrounds, are represented in your dataset.

3. Adjust Hyperparameters

  • Learning Rate: Start with a slightly higher learning rate (e.g., lr0=1e-3) and monitor the behavior. YOLO models generally perform well with a learning rate warmup phase.
  • Batch Size: If resources allow, increase the batch size (e.g., batch=16) for more stable gradient updates.
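  • As a combined sketch of both adjustments (warmup_epochs is the standard Ultralytics warmup setting; the values below are starting points, not tuned recommendations):
    result = model.train(
        data="/content/drive/MyDrive/Proyecto_de_grado/data/data.yaml",
        epochs=100,
        patience=50,
        batch=16,  # larger batch for more stable gradient estimates, if memory allows
        lr0=1e-3,  # slightly higher initial learning rate
        warmup_epochs=3,  # gentle warmup before the full learning rate applies
        seed=42,
    )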

4. Use Validation Metrics

  • It’s important to validate your model on a held-out validation set rather than just relying on training metrics. Use the model.val() method after training to assess the actual performance:
    val_results = model.val(data="/content/drive/MyDrive/Proyecto_de_grado/data/data.yaml")
    print(val_results.box.map)  # mAP50-95
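  • The same results object also exposes the individual IoU thresholds and per-class scores (attribute names follow the current Ultralytics metrics API; verify them on your installed version):
    print(val_results.box.map50)  # mAP at IoU 0.50
    print(val_results.box.map75)  # mAP at IoU 0.75
    print(val_results.box.maps)  # per-class mAP50-95 values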

5. Enable Advanced Augmentations

  • Leverage data augmentation to improve model robustness. YOLO supports augmentations like mosaic, mixup, and more; each is controlled by its own hyperparameter, which you can set in your YAML file or via the training command:
    result = model.train(
        data="/content/drive/MyDrive/Proyecto_de_grado/data/data.yaml",
        epochs=100,
        patience=50,
        batch=8,
        optimizer="auto",
        lr0=1e-4,
        seed=42,
        mosaic=1.0,  # keep mosaic augmentation enabled
        mixup=0.1,  # add a small amount of mixup
        project="/content/drive/MyDrive/Proyecto_de_grado/runs/freeze_layers/todo_congelado_11s"
    )

6. Monitor Training Logs

  • Inspect the training plots and logs for unusual trends (e.g., loss spikes or divergence). These can provide clues about potential issues such as learning rate instability or overfitting.
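  • One simple way to do this programmatically is to load the results.csv that the trainer writes into the run's save directory. A sketch (the path is inferred from your project argument, and exact column names can vary slightly between Ultralytics versions, so inspect df.columns first):
    import pandas as pd

    # Adjust the path to the actual run folder created inside your `project` directory
    df = pd.read_csv(
        "/content/drive/MyDrive/Proyecto_de_grado/runs/freeze_layers/todo_congelado_11s/train/results.csv"
    )
    df.columns = df.columns.str.strip()  # some versions pad column names with spaces
    print(df.columns.tolist())  # confirm the exact loss/metric column names
    df.plot(x="epoch", y=["val/box_loss", "metrics/mAP50-95(B)"], subplots=True)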

Additional Resources

For more insights on training and validation with YOLO, refer to the Ultralytics docs:
  • Train mode: https://docs.ultralytics.com/modes/train/
  • Val mode: https://docs.ultralytics.com/modes/val/

Summary

Your metrics' unusual behavior is likely due to a combination of dataset size/quality, freezing strategy, and hyperparameter choices. By fine-tuning these aspects, you should see improved stability and performance in your metrics. Let us know how it goes or if you encounter further issues—happy to help! 😊
