Integrate an evaluation harness #12
Labels: enhancement, good first issue, help wanted
We will need to test our models against common, industry-standard benchmarks. EleutherAI's lm-evaluation-harness is a widely used tool for this:
https://github.com/EleutherAI/lm-evaluation-harness
The process will involve:
a test.py script to load the model with the Transformers API
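A minimal sketch of what such a test.py could look like, assuming the model is a causal language model saved in a format that the Transformers `Auto*` classes can load, and that PyTorch is installed. The `--model-path` and `--prompt` flags and the function names are hypothetical, chosen for illustration only; this is a smoke test that the model loads and generates, not the harness integration itself.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line options (flag names are illustrative assumptions)."""
    parser = argparse.ArgumentParser(
        description="Smoke test: load a model with Transformers and generate text."
    )
    parser.add_argument("--model-path", required=True,
                        help="Local directory or Hub ID of the model to load")
    parser.add_argument("--prompt", default="Hello, world",
                        help="Prompt used for a short generation check")
    return parser.parse_args(argv)


def load_model(model_path):
    """Load a causal LM and its tokenizer via the Transformers Auto classes."""
    # Imported here so argument parsing works even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)
    model.eval()  # disable dropout for deterministic inference behaviour
    return model, tokenizer


def main(argv=None):
    args = parse_args(argv)
    model, tokenizer = load_model(args.model_path)
    # Tokenize to PyTorch tensors and generate a few tokens as a sanity check.
    inputs = tokenizer(args.prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=16)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

To run it as a script, call `main()` under an `if __name__ == "__main__":` guard and pass the model location with `--model-path`. If the model loads and generates here, the same path should in principle be usable when pointing the evaluation harness at it.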