how upload labels.txt #983

Closed
1 task done
aadog opened this issue Jan 15, 2025 · 9 comments
Labels
HUB (Ultralytics HUB issues), question (Further information is requested)

Comments

@aadog

aadog commented Jan 15, 2025

Search before asking

Question

Image

classify.zip

Additional

No response

@aadog aadog added the question Further information is requested label Jan 15, 2025
@UltralyticsAssistant UltralyticsAssistant added the HUB Ultralytics HUB issues label Jan 15, 2025
@UltralyticsAssistant
Member

👋 Hello @aadog, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more:

  • Quickstart. Start training and deploying YOLO models with HUB in seconds.
  • Datasets: Preparing and Uploading. Learn how to prepare and upload your datasets to HUB in YOLO format.
  • Projects: Creating and Managing. Group your models into projects for improved organization.
  • Models: Training and Exporting. Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
  • Integrations. Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
  • Ultralytics HUB App. Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
    • iOS. Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
    • Android. Explore TFLite acceleration on mobile devices.
  • Inference API. Understand how to use the Inference API for running your trained models in the cloud to generate predictions.

If this is a 🐛 Bug Report, please provide screenshots, details, and most importantly a Minimum Reproducible Example (MRE) to help us reproduce the issue and investigate further. For example:

  1. What you were attempting to do.
  2. Exact steps or code snippet used for uploading labels.txt.
  3. Any error messages or issues encountered during the process.

If this is a ❓ Question, consider sharing additional insights like:

  • Your dataset structure.
  • Relevant steps you followed.
  • Any specific challenges you're facing while uploading labels.txt.

We try to respond to all issues as promptly as possible. This is an automated response, but an Ultralytics engineer will follow up with you soon to offer further assistance. Thank you for your patience! 😊

@aadog
Author

aadog commented Jan 16, 2025

I need to use labels.txt to create multiple categories.

@pderrenger
Member

@aadog thank you for your question! To create multiple categories using a labels.txt file, you can follow these steps to integrate it into the Ultralytics workflow:

  1. Prepare the labels.txt file:
    Ensure your labels.txt file contains one class name per line. For example:

    person
    car
    dog
    cat
    ...
    

    Each line corresponds to a class ID, starting with 0 for the first line, 1 for the second, and so on.

  2. Integrate it into a Dataset YAML:
    The best practice is to reference these labels in the names section of your dataset YAML file. For example:

    path: ../datasets/my_dataset  # Dataset root directory
    train: images/train           # Train images (relative to `path`)
    val: images/val               # Validation images (relative to `path`)
    
    # Classes
    names:
      0: person
      1: car
      2: dog
      3: cat
      # Add more classes if needed

    If you have a labels.txt, you can programmatically read it and populate the names field in the YAML file.

  3. Upload and Use in Ultralytics HUB:

    • Structure your dataset: Follow the Ultralytics dataset structure, ensuring your labels (in YOLO format) are correctly paired with your images.
    • Zip the dataset: Place the dataset YAML file in the root directory of your dataset and zip it.
    • Upload to Ultralytics HUB: Navigate to the Datasets page on Ultralytics HUB, click Upload Dataset, and select your zipped file. HUB will process the dataset and validate it.
  4. Validate Your Dataset:
    Before uploading, you can validate the dataset locally using the following command:

    from ultralytics.hub import check_dataset

    # Set `task` to match the dataset type, e.g. "detect" or "classify"
    check_dataset("path/to/your_dataset.zip", task="detect")

Once uploaded, you can train a model using your dataset with these categories directly on Ultralytics HUB. If you encounter any issues, feel free to ask for further assistance! 😊

For more details, you can explore the datasets documentation.
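
As a rough illustration of the zip step above, the sketch below packages a dataset folder using only Python's standard library. The helper name `zip_dataset` and the folder name `my_dataset` are hypothetical, and the sketch assumes the dataset YAML already sits directly in the dataset root. Check the HUB datasets documentation for the exact naming and layout HUB expects.

```python
import shutil
from pathlib import Path


def zip_dataset(dataset_dir: str) -> Path:
    """Zip `dataset_dir` into a sibling `<name>.zip`, keeping the top-level folder."""
    root = Path(dataset_dir).resolve()
    if not root.is_dir():
        raise NotADirectoryError(root)
    # Assumption: the dataset YAML lives directly inside the dataset root.
    if not any(root.glob("*.yaml")):
        raise FileNotFoundError(f"no dataset YAML found in {root}")
    archive = shutil.make_archive(
        base_name=str(root),        # archive path without the .zip suffix
        format="zip",
        root_dir=str(root.parent),  # entries are stored relative to the parent folder
        base_dir=root.name,         # only this folder is added to the archive
    )
    return Path(archive)
```

Calling `zip_dataset("path/to/my_dataset")` would write `path/to/my_dataset.zip`, which can then be passed to `check_dataset` before uploading.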

@aadog
Author

aadog commented Jan 21, 2025

When I have a large dataset, such as 5-digit verification codes, I would like to use labels.txt to map the images to their categories instead of listing every category under names, because there are too many categories.

@aadog
Author

aadog commented Jan 21, 2025

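@aadog thank you for your question! To create multiple categories using a labels.txt file, you can follow these steps to integrate it into the Ultralytics workflow:

  1. Prepare the labels.txt file:
    Ensure your labels.txt file contains one class name per line. For example:

    person
    car
    dog
    cat
    ...

    Each line corresponds to a class ID, starting with 0 for the first line, 1 for the second, and so on.

  2. Integrate it into a Dataset YAML:
    The best practice is to reference these labels in the names section of your dataset YAML file. For example:

    path: ../datasets/my_dataset  # Dataset root directory
    train: images/train           # Train images (relative to `path`)
    val: images/val               # Validation images (relative to `path`)

    # Classes
    names:
      0: person
      1: car
      2: dog
      3: cat
      # Add more classes if needed

    If you have a labels.txt, you can programmatically read it and populate the names field in the YAML file.

  3. Upload and Use in Ultralytics HUB:

    • Structure your dataset: Follow the Ultralytics dataset structure, ensuring your labels (in YOLO format) are correctly paired with your images.
    • Zip the dataset: Place the dataset YAML file in the root directory of your dataset and zip it.
    • Upload to Ultralytics HUB: Navigate to the Datasets page on Ultralytics HUB, click Upload Dataset, and select your zipped file. HUB will process the dataset and validate it.
  4. Validate Your Dataset:
    Before uploading, you can validate the dataset locally using the following command:

    from ultralytics.hub import check_dataset
    check_dataset("path/to/your_dataset.zip", task="detect")

Once uploaded, you can train a model using your dataset with these categories directly on Ultralytics HUB. If you encounter any issues, feel free to ask for further assistance! 😊

For more details, you can explore the datasets documentation.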

When I have a large dataset, such as 5-digit verification codes, I would like to use labels.txt to map the images to their categories instead of listing every category under names, because there are too many categories.

@sergiuwaxmann
Member

@aadog You need to follow the Ultralytics dataset structure, ensuring your labels (in YOLO format) are correctly paired with your images.
Ultralytics HUB will not understand any other format at the moment.

@aadog
Author

aadog commented Jan 21, 2025

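@aadog You need to follow the Ultralytics dataset structure, ensuring your labels (in YOLO format) are correctly paired with your images. Ultralytics HUB will not understand any other format at the moment.

Do I have to declare all categories in the YAML file?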

@aadog
Author

aadog commented Jan 21, 2025

@aadog You need to follow the Ultralytics dataset structure, ensuring your labels (in YOLO format) are correctly paired with your images. Ultralytics HUB will not understand any other format at the moment.

What should I do if the dataset has hundreds of thousands of verification code combinations?

@pderrenger
Member

Thanks for clarifying, @aadog! When dealing with a dataset containing hundreds of thousands of categories (such as verification code combinations), it can become impractical to manually declare all categories in the names section of the dataset YAML file. Let me guide you on how you can manage this efficiently:

1. Dynamic Category Loading from labels.txt

Instead of manually listing all categories in the YAML file, you can dynamically populate the names field from a labels.txt file programmatically. For example:

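import json

# Create the dataset YAML dynamically from labels.txt
labels_file = "path/to/labels.txt"  # Your labels.txt file (one class name per line)
yaml_file = "path/to/dataset.yaml"  # Dataset YAML file to generate

# Read labels, stripping whitespace and skipping blank lines
with open(labels_file, "r", encoding="utf-8") as f:
    labels = [line.strip() for line in f if line.strip()]

# Static header of the dataset YAML
lines = [
    "path: ../datasets/my_dataset  # Dataset root directory",
    "train: images/train           # Train images (relative to `path`)",
    "val: images/val               # Validation images (relative to `path`)",
    "",
    "# Classes",
    "names:",
]

# Quote each name so that labels such as 01234 stay strings instead of being parsed as numbers
lines += [f"  {i}: {json.dumps(label, ensure_ascii=False)}" for i, label in enumerate(labels)]

with open(yaml_file, "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")

print(f"Dataset YAML with {len(labels)} classes saved to {yaml_file}")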

This script reads your labels.txt file and converts its contents into the names section of the YAML file. Run this script to generate the YAML dynamically.
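
Because each line of labels.txt becomes a class ID, a repeated or empty line would end up as an extra class of its own. A minimal sketch of a pre-check that could run before generating the YAML, using only the standard library; `check_labels` is a hypothetical helper and not part of Ultralytics:

```python
from collections import Counter


def check_labels(lines):
    """Return (clean_labels, duplicates, blank_count) for an iterable of label lines."""
    stripped = [line.strip() for line in lines]
    clean = [s for s in stripped if s]
    blank_count = len(stripped) - len(clean)
    counts = Counter(clean)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    return clean, duplicates, blank_count


clean, duplicates, blanks = check_labels(["01234\n", "98765\n", "\n", "01234\n"])
print(duplicates, blanks)  # ['01234'] 1
```

If `duplicates` is non-empty, it is probably worth fixing labels.txt first, since deduplicating afterwards changes the line-to-ID mapping.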


2. Dataset Structure

Ensure your dataset still follows the YOLO format:

  • Images are stored in images/train/ and images/val/ directories, matching the train and val entries in the YAML above.
  • Corresponding YOLO .txt label files (for bounding boxes or categories) are stored in labels/train/ and labels/val/.
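
The image-to-label pairing described above can be spot-checked locally before zipping. The sketch below is only an illustration: `find_unpaired_images` is a hypothetical helper, the set of image extensions is an assumption, and both directories are passed in explicitly so it does not depend on a particular folder layout.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image extensions


def find_unpaired_images(image_dir, label_dir):
    """Return names of images in `image_dir` with no `<stem>.txt` file in `label_dir`."""
    image_dir = Path(image_dir)
    label_dir = Path(label_dir)
    missing = []
    for image in sorted(image_dir.iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        if not (label_dir / f"{image.stem}.txt").is_file():
            missing.append(image.name)
    return missing
```

An image without a label file is not necessarily an error, since it may be intentional, but an unexpectedly long list usually points to a naming or path mismatch.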

3. Efficient Training with Large Categories

When training on such a high number of categories, consider these strategies:

  • Preprocessing: Clean your dataset to remove duplicate or unnecessary combinations to reduce the number of categories.
  • Hardware: Use high-memory GPUs to handle large label spaces effectively.
  • Cloud Training: Leverage Ultralytics HUB Cloud Training for scalable and efficient training.
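
For the preprocessing point, one way to find categories that could be dropped is to check which class IDs actually occur in the label files. The sketch below assumes YOLO-style label files, where the first token on each line is the class ID. `used_class_ids` and `unused_classes` are hypothetical helper names, not part of Ultralytics.

```python
from pathlib import Path


def used_class_ids(label_dir):
    """Collect class IDs that appear as the first token of any line in the label files."""
    used = set()
    for label_file in Path(label_dir).rglob("*.txt"):
        for line in label_file.read_text(encoding="utf-8").splitlines():
            parts = line.split()
            if parts:
                used.add(int(parts[0]))
    return used


def unused_classes(names, label_dir):
    """Return {class_id: name} for classes in `names` that never occur in the labels."""
    used = used_class_ids(label_dir)
    return {i: name for i, name in enumerate(names) if i not in used}
```

Note that removing unused classes shifts the remaining class IDs, so the label files would need to be remapped to the new IDs as well.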

4. Upload to Ultralytics HUB

Once structured, zip your dataset and ensure the YAML file resides in the root directory. You can then upload it to Ultralytics HUB for training.


If there’s anything specific you’re stuck on (e.g., managing large .txt files, memory issues during training), feel free to ask! 😊

4 participants