QUEST-AI: A System for Question Generation, Verification, and Refinement using AI for USMLE-Style Exams
QUEST-AI is a system that generates, verifies, and refines USMLE-style exam questions using Large Language Models (LLMs), primarily GPT-4. It aims to streamline the development of medical exam content, offering a cost-effective and efficient way to create study materials and practice questions for the United States Medical Licensing Examination (USMLE).
The system works in three stages:
- Question Generation: GPT-4 generates USMLE-style questions.
- Verification: An ensemble of LLMs identifies and flags potentially incorrect questions.
- Refinement: GPT-4 refines flagged questions to improve accuracy and validity.
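The verification stage above can be sketched as a simple vote among models. This is a minimal illustration, not the repository's implementation: it assumes each model answers the question independently with a single option letter, and that a question is flagged when the most common answer disagrees with the answer key or when too few models agree. The names `flag_question`, `Model`, and `min_agreement` are hypothetical, and the stub models stand in for real LLM calls.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical type: a "model" is any callable mapping question text to an answer letter.
Model = Callable[[str], str]


def flag_question(
    question: str,
    answer_key: str,
    models: List[Model],
    min_agreement: float = 0.5,  # assumed threshold, not taken from the source
) -> bool:
    """Return True if the ensemble's vote suggests the question may be incorrect."""
    if not models:
        raise ValueError("at least one model is required")

    # Collect one normalized answer letter per model.
    votes = Counter(model(question).strip().upper() for model in models)
    top_answer, top_count = votes.most_common(1)[0]
    agreement = top_count / len(models)

    # Flag when the most common answer disagrees with the key,
    # or when no single answer has enough support.
    return top_answer != answer_key.strip().upper() or agreement < min_agreement


# Stub models standing in for real LLM calls.
stub_models = [lambda q: "A", lambda q: "A", lambda q: "C"]

print(flag_question("Sample question text", "A", stub_models))  # False
print(flag_question("Sample question text", "B", stub_models))  # True
```

Flagged questions would then be passed to the refinement stage. A stricter `min_agreement` flags more questions for review at the cost of more refinement calls.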
The repository is organized as follows:
- data/: Contains datasets, including both AI-generated and human-generated questions.
- manuscript/: Drafts and related documents for the research paper.
- notebooks/: Jupyter notebooks for data analysis and evaluation.
- src/: Source code for the QUEST-AI system.
Clone the repository:
git clone https://github.com/som-shahlab/gpt4usmle.git
To run inference with the ensemble of LLMs, run:
bash inference_single_model.sh
To fix or refine incorrect questions generated by GPT-4, run:
python fix_incorrect_questions.py
To categorize questions by USMLE content category, run:
python classify_questions.py