- Access the full report at: link
- Contributors: Yinuo Zhao, Yuanyi Liu
VisionClue is a two-stage prompting strategy designed to improve the performance of multi-modal language models on object-counting tasks in images. This repository contains the code and documentation used to implement VisionClue, which uses self-generated hints to enhance model accuracy without requiring any additional data.
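As a rough illustration of the two-stage idea, the sketch below first asks the model for hints about the scene, then feeds those hints back into the counting query. This is a minimal sketch assuming the `openai` Python client; the model name, prompts, and helper functions are illustrative, not the exact ones used in this repository (see `gpt4_evaluation.py` for the actual implementation).

```python
# Hypothetical sketch of two-stage prompting with self-generated hints.
# Assumes the `openai` package; prompts and model name are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask(prompt: str, image_url: str) -> str:
    """Send one image plus a text prompt and return the model's reply."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: any vision-capable chat model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content


def count_with_visionclue(image_url: str, category: str) -> str:
    # Stage 1: the model generates its own hints about the scene.
    hints = ask(
        f"Describe where the {category} appear in this image and how "
        f"they are arranged. Do not count them yet.",
        image_url,
    )
    # Stage 2: the hints are fed back as context for the counting query.
    return ask(
        f"Hints about the image: {hints}\n"
        f"Using these hints, count the {category} and answer with a number.",
        image_url,
    )
```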
The repository is organized as follows:
- `FSC147_384_V2/`: Scripts for preprocessing the FSC147 dataset.
- `plots/`: Visualization scripts that generate plots comparing model performance and object counts.
- `results/`: Directory for storing output files from the experiments.
- `helpers.py`: Utility functions used across different scripts.
- `human_evaluation_gui.py`: A GUI tool for manual object counting to compare against model performance.
- `preprocess_FSC147.py`: Preprocessing script for the FSC147 dataset.
- `rmse_evaluation.py`: Script to calculate the RMSE of model predictions against true values (see the sketch after this list).
- `analysis_summary.py`: Summary script that compiles results from various experiments.
- `gpt4_evaluation.py`: Implementation of the GPT-4 model evaluations with different prompting strategies.
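For reference, `rmse_evaluation.py` computes root-mean-square error, which follows the standard definition. A minimal sketch with made-up example counts (the function name and values are illustrative, not taken from the script):

```python
# Minimal RMSE sketch; the example counts below are made up for
# illustration and are not outputs of rmse_evaluation.py.
import numpy as np


def rmse(predictions: np.ndarray, targets: np.ndarray) -> float:
    """Root-mean-square error: sqrt(mean((pred - true)^2))."""
    return float(np.sqrt(np.mean((predictions - targets) ** 2)))


# Example usage with made-up counts:
print(rmse(np.array([12, 40, 7]), np.array([10, 45, 7])))  # ~3.11
```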
To assess human performance:
- Execute `human_evaluation_gui.py` to start the manual counting process.
- Results will be saved to `FSC147_384_V2/selected_300_image_annotation.csv` in the last column, labeled "human".
- The message "No more images to label." indicates that all 300 image assessments are complete.
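As an optional sanity check on labeling progress, one might inspect the "human" column directly. A hedged sketch, assuming the CSV has a header row and that unlabeled rows appear as empty cells:

```python
# Sketch for checking labeling progress; assumes the annotation CSV has
# a header row and a column named "human" (per the layout above), with
# unlabeled images left as empty cells.
import pandas as pd

df = pd.read_csv("FSC147_384_V2/selected_300_image_annotation.csv")
remaining = df["human"].isna().sum()
print(f"{len(df) - remaining} of {len(df)} images labeled; "
      f"{remaining} remaining.")
```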