Visual Benchmarks for Testing LLM's Level of Common Sense

This repository holds benchmark test cases for evaluating how much common sense a given multimodal LLM (Large Language Model) shows in various application areas.

Benchmarks for Autonomous Cars

Upcoming.

Benchmarks for Senior Care

Upcoming.

Benchmarks for Baby Monitoring