This repository was created for the open-source LAION-Beyond dataset. The benchmark consists of two parts: OOV (Out of Vocabulary) and IV (In Vocabulary). The OOV part contains 106,052 image-text pairs, and the IV part contains 51,330 images (IV data does not provide captions and is used only for evaluation). LAION-Beyond covers 9 domains: Plants Fungi, Insects Spiders, Animals, Pokemon, FolkArt, Landmark, Attire, Food, and Architecture.
Our preprint and the LAION-Beyond dataset will be released soon.
- 05.12.2023: Created the LAION-Beyond repo.