Agricultural datasets for developing AI and robotics systems applied to agriculture
- PlantCLEF2022: Image-based plant identification at global scale - https://www.imageclef.org/plantclef2022
- Deep Learning for Non-Invasive Diagnosis of Nutrient Deficiencies in Sugar Beet Using RGB Images: https://zenodo.org/records/4106221#.YqdMcexBzon
- Weed25: A deep learning dataset for weed identification: https://doi.org/10.3389/fpls.2022.1053329
- A phenotyping weeds image dataset for open scientific research: https://zenodo.org/records/7598372
- SorghumWeedDataset_Classification and SorghumWeedDataset_Segmentation datasets for classification, detection, and segmentation in deep learning: https://doi.org/10.1016/j.dib.2023.109935
- Sugar Beets 2016: https://www.ipb.uni-bonn.de/data/sugarbeets2016/
- A Crop/Weed Field Image Dataset for the Evaluation of Computer Vision Based Precision Agriculture Tasks: 60 annotated RGB images - https://github.com/cwfid/dataset/tree/master
- RELLIS-3D: A Multi-modal Dataset for Off-Road Robotics: Semantic segmentation on 2D RGB images and 3D LiDAR pointclouds - https://github.com/unmannedlab/RELLIS-3D/tree/main
- RUGD Dataset: The RUGD dataset focuses on semantic understanding of unstructured outdoor environments for applications in off-road autonomous navigation. The dataset comprises video sequences captured from the camera onboard a mobile robot platform - http://rugd.vision/
- GOOSE dataset: GOOSE is the German Outdoor and Offroad Dataset and is a 2D & 3D semantic segmentation dataset framework. In contrast to existing datasets like Cityscapes or BDD100K, the focus is on unstructured off-road environments - https://goose-dataset.de/docs/
- WildScenes: The WildScenes dataset is a multi-modal collection of traversals within Australian forests. The dataset is divided into five sequences across two forest locations. The sequences span both different physical locations and different times - https://csiro-robotics.github.io/WildScenes/
- BotanicGarden: A robot navigation dataset in a botanic garden of more than 48,000 m². Comprehensive sensors are used, including gray and RGB stereo cameras, spinning and MEMS 3D LiDARs, and low-cost and industrial-grade IMUs. An all-terrain wheeled robot is employed for data collection, traversing through thick woods, riversides, narrow trails, bridges, and grasslands. This yields 33 short and long sequences, forming 17.1 km of trajectories in total - https://github.com/robot-pesg/BotanicGarden
- Data synthesis methods for semantic segmentation in agriculture: A Capsicum annuum dataset: https://doi.org/10.1016/j.compag.2017.12.001
- Dataset on UAV RGB videos acquired over a vineyard including bunch labels for object detection and tracking: https://www.sciencedirect.com/science/article/pii/S2352340922010514
- ACFR Orchard Fruit Dataset: https://data.acfr.usyd.edu.au/ag/treecrops/2016-multifruit/
- An annotated visual dataset for Automatic weed detection and identification: https://zenodo.org/records/3906501
- GrapeMOTS: UAV vineyard dataset with MOTS grape bunch annotations recorded from multiple perspectives for enhanced object detection and tracking: https://doi.org/10.1016/j.dib.2024.110432
- CornWeed Dataset: A dataset for training maize and weed object detectors for agricultural machines: https://zenodo.org/records/7961764
- CitDet: A Benchmark Dataset for Citrus Fruit Detection: https://robotic-vision-lab.github.io/citdet/
- WeedCrop Image Dataset: It includes 2822 images annotated in YOLO v5 PyTorch format - https://www.kaggle.com/datasets/vinayakshanawad/weedcrop-image-dataset
- Embrapa Wine Grape Instance Segmentation Dataset – Embrapa WGISD: https://github.com/thsant/wgisd
- SorghumWeedDataset_Classification and SorghumWeedDataset_Segmentation datasets for classification, detection, and segmentation in deep learning: https://doi.org/10.1016/j.dib.2023.109935
- The ACRE Crop-Weed Dataset: https://zenodo.org/records/8102217
- ROSE Challenge dataset: Crop-weed dataset with images collected in different years by different robots - https://www.challenge-rose.fr/en/dataset-download/
- MinneApple: A Benchmark Dataset for Apple Detection and Segmentation: https://github.com/nicolaihaeni/MinneApple
- The CropAndWeed Dataset: A Multi-Modal Learning Approach for Efficient Crop and Weed Manipulation: 8k high-quality images and about 112k annotated plant instances. In addition to bounding boxes, segmentation masks and stem positions, annotations include a fine-grained classification into 16 crop and 58 weed species, as well as extensive meta-annotations of relevant environmental and recording parameters - https://github.com/cropandweed/cropandweed-dataset/tree/main
- Dataset on UAV RGB videos acquired over a vineyard including bunch labels for object detection and tracking: https://www.sciencedirect.com/science/article/pii/S2352340922010514
- GrapeMOTS: UAV vineyard dataset with MOTS grape bunch annotations recorded from multiple perspectives for enhanced object detection and tracking: https://doi.org/10.1016/j.dib.2024.110432
- CitrusFarm Dataset: CitrusFarm is a multimodal agricultural robotics dataset that provides both multispectral images and navigational sensor data for localization, mapping and crop monitoring tasks - https://ucr-robotics.github.io/Citrus-Farm-Dataset/
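
Several detection datasets in the list above ship annotations as YOLO-style text files (the WeedCrop Image Dataset entry mentions the YOLO v5 PyTorch format). The sketch below shows one way to turn such labels into pixel-space boxes. It assumes the common YOLO convention of one object per line, written as `class_id x_center y_center width height` with coordinates normalized to the image size. The function name `parse_yolo_labels` is hypothetical and not part of any dataset's tooling, so check each dataset's own documentation before relying on it.

```python
from typing import List, Tuple

# (class_id, x_min, y_min, x_max, y_max), with coordinates in pixels
Box = Tuple[int, float, float, float, float]


def parse_yolo_labels(text: str, img_w: int, img_h: int) -> List[Box]:
    """Convert YOLO-style label lines into pixel-space corner boxes.

    Assumes each non-empty line is 'class_id x_center y_center width height'
    with the four coordinates normalized to [0, 1] (an assumption about the
    label files, not something verified against a specific dataset).
    """
    boxes: List[Box] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        class_id = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        # Scale the normalized center/size to pixels and convert to corners.
        x_min = (xc - w / 2) * img_w
        y_min = (yc - h / 2) * img_h
        x_max = (xc + w / 2) * img_w
        y_max = (yc + h / 2) * img_h
        boxes.append((class_id, x_min, y_min, x_max, y_max))
    return boxes


if __name__ == "__main__":
    sample = "0 0.5 0.5 0.5 0.25\n1 0.25 0.75 0.5 0.5"
    for box in parse_yolo_labels(sample, img_w=640, img_h=480):
        print(box)
```

In practice the `text` argument would be the contents of one label file, and `img_w` and `img_h` would come from the matching image. Datasets that use other annotation formats (for example segmentation masks or COCO-style JSON) need a different loader.
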
The following are multimodal datasets that combine data from different sensors, such as RGB, stereo, and RGB-D cameras, LiDARs, IMUs, GPS, thermal cameras, and hyperspectral cameras. They typically do not include labels.
- Sugar Beets 2016: https://www.ipb.uni-bonn.de/data/sugarbeets2016/
- CitrusFarm Dataset: CitrusFarm is a multimodal agricultural robotics dataset that provides both multispectral images and navigational sensor data for localization, mapping and crop monitoring tasks - https://ucr-robotics.github.io/Citrus-Farm-Dataset/
- A high-resolution, multimodal data set for agricultural robotics: A Ladybird's-eye view of Brassica: https://doi.org/10.1002/rob.21877
- RELLIS-3D: A Multi-modal Dataset for Off-Road Robotics: Semantic segmentation on 2D RGB images and 3D LiDAR pointclouds - https://github.com/unmannedlab/RELLIS-3D/tree/main
- RUGD Dataset: The RUGD dataset focuses on semantic understanding of unstructured outdoor environments for applications in off-road autonomous navigation. The dataset comprises video sequences captured from the camera onboard a mobile robot platform - http://rugd.vision/
- GOOSE dataset: GOOSE is the German Outdoor and Offroad Dataset and is a 2D & 3D semantic segmentation dataset framework. In contrast to existing datasets like Cityscapes or BDD100K, the focus is on unstructured off-road environments - https://goose-dataset.de/docs/
- WildScenes: The WildScenes dataset is a multi-modal collection of traversals within Australian forests. The dataset is divided into five sequences across two forest locations. The sequences span both different physical locations and different times - https://csiro-robotics.github.io/WildScenes/
- BotanicGarden: A robot navigation dataset in a botanic garden of more than 48,000 m². Comprehensive sensors are used, including gray and RGB stereo cameras, spinning and MEMS 3D LiDARs, and low-cost and industrial-grade IMUs. An all-terrain wheeled robot is employed for data collection, traversing through thick woods, riversides, narrow trails, bridges, and grasslands. This yields 33 short and long sequences, forming 17.1 km of trajectories in total - https://github.com/robot-pesg/BotanicGarden
- Quantitative Plant: Website that collects datasets for image classification, semantic segmentation and phenotyping - https://www.quantitative-plant.org/dataset
- A survey of public datasets for computer vision tasks in precision agriculture: Collection of datasets for detection and segmentation of weeds and fruits and phenotyping tasks (e.g., damage and disease detection, biomass prediction, yield estimation) - https://doi.org/10.1016/j.compag.2020.105760
- CropCraft: CropCraft is a Python script that generates 3D models of crop fields, specialized in real-time simulation of robotics applications - https://github.com/Romea/cropcraft
- TomatoSynth: TomatoSynth provides realistic synthetic tomato plants training data for deep learning applications, reducing the need for manual annotation and allowing customization for specific greenhouse environments, thus advancing automation in agriculture - https://github.com/SCT-lab/TomatoSynth