From 8ae05037fc25ddfde5f3d70c64c38fb27a1eb0a6 Mon Sep 17 00:00:00 2001
From: joesharratt1229
Date: Mon, 3 Mar 2025 10:37:07 +0100
Subject: [PATCH] updated read me

---
 eval/README.md | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/eval/README.md b/eval/README.md
index f7f89c7c..a52c4789 100644
--- a/eval/README.md
+++ b/eval/README.md
@@ -126,10 +126,23 @@ Options:
 - `--size`: Default dataset size (default: 100)
 - `--seed`: Default dataset seed (default: 42)
 - `--include-params`: Include all configuration parameters (default: False)
+- `--category`: Only include datasets from this category (default: None)
 
-### Running Evaluations
+#### Generating Config for a Specific Category
+
+To generate a configuration file containing only datasets from a specific category:
+
+```bash
+python generate_config.py --category algorithmic --output algorithmic_datasets.yaml --model "anthropic/claude-3.5-sonnet"
+```
 
-To run evaluations:
+This will create a configuration file that includes only datasets in the "algorithmic" category. This is useful when you want to focus your evaluation on a specific type of reasoning tasks.
+
+Example categories include: math, arithmetic, reasoning, algorithmic, etc. The category is automatically extracted from the dataset's module name (e.g., from `reasoning_gym.math.dataset_name`, it extracts "math").
+
+You can see all available categories by running the script without the `--category` option, as it will print all categories at the end of execution.
+
+### Running Evaluations
 
 ```bash
 python eval.py --config configs/your_config.yaml