Cloned from git://github.com/whymarrh/jeopardy-parser.git
The original has been modified so that it does not write to a SQL database and uses '||' as the field separator
This makes the output easier to load into a pandas DataFrame for subsequent cleaning

pip install -r requirements.txt
python download.py <archive_dir> <starting_game_to_download>
python parser.py -d <archive_dir> > jarchive_xxx.csv
cat jarchive_xxx.csv > jarchive.csv

download.py first downloads the games into the archive directory, j-archive (it currently holds games up to id 6095, which is 7/27/18)
parser.py then extracts questions to jarchive.csv
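
As a rough illustration of why the '||' separator helps, the sketch below splits each output line into fields using only the standard library. The helper name `read_rows` and the sample row are made up for this example; the actual column layout written by parser.py is not described here, so treat the field contents as placeholders.

```python
import io

SEPARATOR = "||"  # field separator described above


def read_rows(handle, sep=SEPARATOR):
    """Yield each non-empty line of `handle` as a list of fields split on `sep`."""
    for line in handle:
        line = line.rstrip("\n")
        if line:
            yield line.split(sep)


# Hypothetical sample row; the real columns produced by parser.py may differ.
sample = io.StringIO("1||HISTORY||200||Some clue text||Some answer\n")
rows = list(read_rows(sample))
print(rows[0])
```

To read the real file, pass an open file handle for jarchive.csv instead of the sample. When loading with pandas instead, note that `pandas.read_csv` treats a multi-character `sep` as a regular expression, so the pipes need escaping, for example `sep=r"\|\|"` together with `engine="python"`.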