Pile 10k new task (EleutherAI#1758)

* Add Pile-10k readme
* Add Pile-10k task configuration file
OpenLLM-France · May 1, 2024 · b898bda
1 parent 552eeae
Showing 2 changed files with 64 additions and 0 deletions.
diff --git a/lm_eval/tasks/pile_10k/README.md b/lm_eval/tasks/pile_10k/README.md
@@ -0,0 +1,45 @@
+# Pile-10k
+
+### Paper
+
+Title: `NeelNanda/pile-10k`
+
+Abstract: The first 10K elements of [The Pile](https://pile.eleuther.ai/), useful for debugging models trained on it. See the [HuggingFace page for the full Pile](https://huggingface.co/datasets/the_pile) for more info. Inspired by [stas' great resource](https://huggingface.co/datasets/stas/openwebtext-10k) doing the same for OpenWebText.
+
+Homepage: [https://huggingface.co/datasets/NeelNanda/pile-10k](https://huggingface.co/datasets/NeelNanda/pile-10k)
+
+
+### Citation
+
+```
+@misc{Nanda2022Pile10K,
+  author = {Nanda, Neel},
+  title = {{NeelNanda/pile-10k} \textendash\ Datasets at Hugging Face},
+  year = {2022},
+  howpublished = {\url{https://huggingface.co/datasets/NeelNanda/pile-10k}},
+}
+```
+
+### Groups and Tasks
+
+#### Groups
+
+* Not part of a group yet.
+
+
+#### Tasks
+
+* `pile_10k`: `The first 10K elements of The Pile, useful for debugging models trained on it.`
+
+### Checklist
+
+For adding novel benchmarks/datasets to the library:
+* [ ] Is the task an existing benchmark in the literature?
+  * [ ] Have you referenced the original paper that introduced the task?
+  * [ ] If yes, does the original paper provide a reference implementation? If so, have you checked against the reference implementation and documented how to run such a test?
+
+
+If other tasks on this dataset are already supported:
+* [ ] Is the "Main" variant of this task clearly denoted?
+* [ ] Have you provided a short sentence in a README on what each new variant adds / evaluates?
+* [ ] Have you noted which, if any, published evaluation setups are matched by this variant?
diff --git a/lm_eval/tasks/pile_10k/pile_10k.yaml b/lm_eval/tasks/pile_10k/pile_10k.yaml
@@ -0,0 +1,19 @@
+task: pile_10k
+dataset_path: NeelNanda/pile-10k
+dataset_name: null
+output_type: loglikelihood_rolling
+test_split: train
+doc_to_text: ""
+doc_to_target: "text"
+metric_list:
+  - metric: word_perplexity
+    aggregation: weighted_perplexity
+    higher_is_better: false
+  - metric: byte_perplexity
+    aggregation: weighted_perplexity
+    higher_is_better: false
+  - metric: bits_per_byte
+    aggregation: bits_per_byte
+    higher_is_better: false
+metadata:
+  version: 1.0
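
Note on the metrics: the configuration above pairs each metric with a `weighted_perplexity` or `bits_per_byte` aggregation. The sketch below illustrates how such aggregations are commonly computed from per-document rolling log-likelihoods. It is a minimal illustration, not the harness's own code: the helper names (`rolling_counts`, `aggregate`) are hypothetical, and it assumes log-likelihoods are natural-log totals per document, words are counted by splitting on whitespace, and bytes are counted on the UTF-8 encoding.

```python
import math
import re


def rolling_counts(text, loglikelihood):
    """Pair a document's total log-likelihood with its word and byte counts.

    Assumption: words are whitespace-separated chunks, bytes are UTF-8 bytes.
    """
    n_words = len(re.split(r"\s+", text))
    n_bytes = len(text.encode("utf-8"))
    return (loglikelihood, n_words), (loglikelihood, n_bytes)


def weighted_mean(items):
    """Sum of log-likelihoods divided by sum of unit counts (words or bytes)."""
    totals, weights = zip(*items)
    return sum(totals) / sum(weights)


def weighted_perplexity(items):
    """Perplexity per unit: exp of the negative average log-likelihood."""
    return math.exp(-weighted_mean(items))


def bits_per_byte(items):
    """Negative average log-likelihood per byte, converted from nats to bits."""
    return -weighted_mean(items) / math.log(2)


def aggregate(docs):
    """docs: iterable of (text, total natural-log likelihood) pairs."""
    word_items, byte_items = [], []
    for text, loglikelihood in docs:
        by_word, by_byte = rolling_counts(text, loglikelihood)
        word_items.append(by_word)
        byte_items.append(by_byte)
    return {
        "word_perplexity": weighted_perplexity(word_items),
        "byte_perplexity": weighted_perplexity(byte_items),
        "bits_per_byte": bits_per_byte(byte_items),
    }
```

Under these assumptions, `byte_perplexity` equals `2 ** bits_per_byte`, and all three values decrease as the model assigns higher likelihood to the text, which is consistent with `higher_is_better: false` in the configuration.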