
Commit

Update tutorials/distributed_data_classification/fineweb-edu-ensebmle…
…-classification.ipynb

Co-authored-by: Sarah Yurick <[email protected]>
Signed-off-by: Vibhu Jawa <[email protected]>
VibhuJawa and sarahyurick authored Feb 10, 2025
1 parent 0b45a21 commit 2eaf955
Showing 1 changed file with 2 additions and 2 deletions.
@@ -9,8 +9,8 @@
 "### Ensembling `FineWeb Mixtral Educational Classifier`, `FineWeb Nemotron-4 Educational Classifier`, and `fasttext-oh-eli5`\n",
 "\n",
 "This notebook demonstrates distributed data classification by ensembling:\n",
-"1. NeMo Curator’s [`FineWeb Mixtral Educational Classifier`](TODO)\n",
-"2. NeMo Curator’s [`FineWeb Nemotron-4 Educational Classifier`](TODO)\n",
+"1. NeMo Curator’s [`FineWebMixtralEduClassifier`](https://huggingface.co/nvidia/nemocurator-fineweb-mixtral-edu-classifier)\n",
+"2. NeMo Curator’s [`FineWebNemotronEduClassifier`](https://huggingface.co/nvidia/nemocurator-fineweb-nemotron-4-edu-classifier)\n",
 "3. Fast Text's [`fasttext-oh-eli5`](https://huggingface.co/mlfoundations/fasttext-oh-eli5) from Hugging Face.\n",
 "\n",
 "The FineWeb educational classifiers (excluding FastText) leverage [CrossFit](https://github.com/rapidsai/crossfit), a RAPIDS-accelerated library for intelligent batching, to enhance offline inference performance on large datasets.\n",
