auto_labeler

A library to automatically label any computer vision dataset at zero/near-zero manual labeling cost.

TODO

Add config driven prompting for VLMs
Enable visualization for LoFTR
Support for SuperGlue and for classical CV feature matching algorithms such as SIFT and SURF

Introduction

auto_labeler is a simple, easy-to-use framework that helps you generate high-quality pseudo-labels for a given computer vision task. The library abstracts away the various SOTA computer vision algorithms and their modes of usage (such as image-based retrieval, text-to-image retrieval, feature matching for classification, zero-shot or single-shot promptable detection or instance segmentation, and exploiting LLMs) so that you can auto-label any vision dataset.

Features:

1. High Abstraction: auto_labeler provides built-in wrappers over existing, widely used frameworks such as HuggingFace, facebook-research and other key repos that are the sources of key SOTA architectures and models. Its generic, uniform interface gives easy access to most SOTA computer vision techniques and hides the details that researchers and companies can usually ignore.

2. Modular Interface: auto_labeler aims at building independent, specific modules for each vision task, making it easy to add new algorithms or modify the existing ones as needed.

3. Minimal Touchpoints for Faster Labeling: auto_labeler requires minimal work from the user before the labeling process can start. A few configuration changes to choose the architecture and the weights to use (along with some hyper-parameter changes) and a simple-to-use, generic label.py inference wrapper take care of the rest.

4. Support for a multitude of Vision Tasks: The library currently supports the following vision tasks (architectures supported are CLIP, OWL-ViT-V2 and SAM-ViT):

Image Classification - with image-to-image retrieval and text-based retrieval
Object Detection - under zero-shot text-based and promptable image-based settings
Instance Segmentation - under a zero-shot setting
Feature/Keypoint Matching - for image/instance retrieval and 2D image correspondence applications
Visual Question Answering (VQA) - with the help of vision language models
Optical Character Recognition (OCR) - using text detection + recognition as well as end-to-end learnable architectures

Installation

Setup repository

git clone git@github.com:mailcorahul/auto_labeler.git
cd auto_labeler/

Create a virtualenv Python environment (preferably Python 3.8 or above)

virtualenv -p python3.8 "path to autolabeler environment"
source "path to autolabeler environment"/bin/activate
pip install -r requirements.txt

Getting Started

Model Zoo

List of architectures supported for various vision tasks

Visual Question Answering:

Feature Matching:

Local Feature Matching with Transformers(LoFTR)

OCR:

Usage

Image Classification

cd image_classification
python label.py --unlabelled-dump 'path to the unlabelled dataset containing images' \
 --class2prompts 'path to a json containing class names along with their text prompts, if already known (optional)' \
 --result-path 'root folder path to save the auto labeled classification data'
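
To make the text-based retrieval mode more concrete, here is a minimal sketch of how an image could be assigned to the class whose prompt is most similar to it. It is an illustration only and not the code in image_classification/label.py: the function names are hypothetical, and the 3-d vectors are toy stand-ins for the image and text embeddings a model such as CLIP would produce.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def assign_label(image_embedding, class_prompt_embeddings):
    """Return (best_class, score).

    A class is scored by its best-matching prompt, so a class with
    several prompts is represented by whichever one fits the image best.
    """
    best_class, best_score = None, -1.0
    for class_name, prompt_embeddings in class_prompt_embeddings.items():
        score = max(cosine_similarity(image_embedding, p) for p in prompt_embeddings)
        if score > best_score:
            best_class, best_score = class_name, score
    return best_class, best_score

# Toy embeddings (hypothetical values): two prompts for "cat", one for "dog".
toy_prompts = {
    "cat": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "dog": [[0.0, 1.0, 0.0]],
}
label, score = assign_label([0.8, 0.2, 0.0], toy_prompts)
print(label, round(score, 3))
```

In practice one would likely also reject images whose best score falls below some threshold, so that out-of-vocabulary images are left unlabeled rather than forced into a class.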

Object Detection

cd object_detection
python label.py --unlabelled-dump 'path to the unlabelled dataset containing images' \
 --class-texts-path 'path to a json containing the list of class objects to detect' \
 --prompt-images 'path to prompt images for guided one shot detection' \
 --result-path 'path to .json file to save the auto labeled detection data' \
 --viz 'False (set to True if visualization is required)' \
 --viz-path 'path to save detection bbox visualizations'
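
The file passed to --class-texts-path is described only as "a json containing the list of class objects". The sketch below writes such a file under the assumption that it is a flat JSON list of strings; that layout, the helper name write_class_texts and the file name class_texts.json are all assumptions, so check object_detection/label.py for the schema it actually expects.

```python
import json
import os
import tempfile

def write_class_texts(class_names, path):
    """Write cleaned, de-duplicated class names to `path` as a JSON list.

    The flat-list layout is an assumed schema, not one confirmed by the README.
    """
    cleaned = (name.strip() for name in class_names)
    # dict.fromkeys keeps the first occurrence of each name and preserves order.
    unique = list(dict.fromkeys(name for name in cleaned if name))
    with open(path, "w", encoding="utf-8") as f:
        json.dump(unique, f, indent=2)
    return unique

# Write to a temporary folder so the sketch does not touch the repository.
out_path = os.path.join(tempfile.mkdtemp(), "class_texts.json")
written = write_class_texts(["car", "person", "car", " traffic light "], out_path)
print(out_path, written)
```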

Instance Segmentation

cd instance_segmentation
python label.py --unlabelled-dump 'path to the unlabelled dataset containing images' \
 --class-texts-path 'path to a json containing the list of class objects to segment' \
 --result-path 'path to .pkl file to save the auto labeled segmentation data' \
 --viz 'False (set to True if visualization is required)' \
 --viz-path 'path to save mask visualizations'
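
Because the segmentation results are saved as a .pkl file, a quick way to see what was produced is to load it with Python's pickle module and look at its top-level shape. The README does not document the contents of the file, so the sketch below deliberately assumes nothing about the schema; the helper names and the demo file are hypothetical.

```python
import os
import pickle
import tempfile

def load_pickle(path):
    """Load a pickle file. Only unpickle files you created or trust:
    loading a pickle can execute arbitrary code."""
    with open(path, "rb") as f:
        return pickle.load(f)

def summarize(obj):
    """Describe the top-level shape of a loaded object without assuming its schema."""
    if isinstance(obj, dict):
        return {"type": "dict", "size": len(obj), "sample_keys": list(obj)[:3]}
    if isinstance(obj, (list, tuple)):
        return {"type": type(obj).__name__, "size": len(obj)}
    return {"type": type(obj).__name__}

# Stand-in file so the sketch runs on its own; in real use, pass the
# path you gave to --result-path instead of demo_path.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_result.pkl")
with open(demo_path, "wb") as f:
    pickle.dump({"img_001.jpg": [], "img_002.jpg": []}, f)

summary = summarize(load_pickle(demo_path))
print(summary)
```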

Visual Question Answering

cd visual_question_answering
python label.py --unlabelled-dump 'path to the unlabelled dataset containing images to be described' \
 --result-path 'path to json containing labeled VQA data'

Feature Matching

cd feature_matching
python label.py --unlabelled-dump 'path to the unlabelled dataset containing images' \
 --reference-images 'path to the reference/index images to retrieve/match against' \
 --result-path 'root folder to contain subfolders (one for every reference) with retrieved images'
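
The result layout described above (one subfolder per reference image, holding the images retrieved for it) suggests a grouping step like the following sketch. It assumes each unlabelled image is assigned to the reference with the most matched keypoints and dropped if that count is too low. This is a guess at the general idea rather than the logic in feature_matching/label.py; the function name, the min_matches threshold and the toy counts are hypothetical.

```python
def group_by_best_reference(match_counts, min_matches=10):
    """Map each reference name to the query images whose best match is that reference.

    match_counts: {query_name: {reference_name: number_of_matched_keypoints}}
    Queries whose best count is below min_matches are left unassigned.
    """
    groups = {}
    for query, counts in match_counts.items():
        if not counts:
            continue  # nothing was matched against this query
        best_ref = max(counts, key=counts.get)
        if counts[best_ref] >= min_matches:
            groups.setdefault(best_ref, []).append(query)
    return groups

# Toy match counts; a matcher such as LoFTR would supply these in practice.
toy_counts = {
    "q1.jpg": {"ref_a.jpg": 120, "ref_b.jpg": 15},
    "q2.jpg": {"ref_a.jpg": 4, "ref_b.jpg": 7},
    "q3.jpg": {"ref_a.jpg": 30, "ref_b.jpg": 85},
}
groups = group_by_best_reference(toy_counts)
print(groups)
```

Here q2.jpg ends up in no group because neither reference reaches the threshold, which is usually preferable to producing a low-confidence pseudo-label.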

OCR

cd ocr
python label.py --unlabelled-dump 'path to the unlabelled dataset containing document images' \
 --result-path 'path to json containing labeled OCR data'
