Actions: sarahyurick/NeMo-Curator

Test Python package

77 workflow runs

Minor updates to duplicate removal (#570)
Test Python package #77: Commit a080400 pushed by sarahyurick
February 26, 2025 19:21 5m 31s main
Improvements for semantic deduplication and DAPT tutorial (#564)
Test Python package #76: Commit 119edd4 pushed by sarahyurick
February 24, 2025 22:16 6m 50s main
Update get_all_files_paths_under examples to include `keep_extensio…
Test Python package #75: Commit 9b1a13c pushed by sarahyurick
February 20, 2025 20:21 5m 28s main
Fix issues with download and extract (#541)
Test Python package #74: Commit 908e0f1 pushed by sarahyurick
February 18, 2025 17:46 5m 41s main
Add notebook to show Fineweb ensemble (#536)
Test Python package #73: Commit 0f0cb31 pushed by sarahyurick
February 14, 2025 21:57 5m 52s main
chore: Version bump (#545)
Test Python package #72: Commit a5d1a7b pushed by sarahyurick
February 12, 2025 23:48 6m 40s main
Add support for Nemotron-CC EDU classifiers (#518)
Test Python package #71: Commit a7fde15 pushed by sarahyurick
February 12, 2025 22:59 5m 52s main
Pin Transformers version >= 4.48.0 (#528)
Test Python package #70: Commit 334a331 pushed by sarahyurick
February 11, 2025 19:06 5m 25s main
Update model nomenclature (#497)
Test Python package #69: Commit 34a1cc6 pushed by sarahyurick
February 7, 2025 17:57 5m 9s main
ci: Version bump to 0.7.0rc1.dev0 (#513)
Test Python package #68: Commit c3fb61d pushed by sarahyurick
February 4, 2025 23:14 8m 18s main
Fix DAPT tutorial (#503)
Test Python package #67: Commit 75234a9 pushed by sarahyurick
January 31, 2025 20:56 5m 19s main
Update fuzzy deduplication to skip false positive checks as the defau…
Test Python package #66: Commit fe41ac1 pushed by sarahyurick
January 30, 2025 22:56 5m 13s main
Create notebook tutorials for distributed data classifiers (#415)
Test Python package #65: Commit cd38de0 pushed by sarahyurick
January 24, 2025 21:58 5m 24s main
Create check_dask_cwd function (#484)
Test Python package #64: Commit 57f0e3c pushed by sarahyurick
January 23, 2025 00:00 4m 22s main
[REVIEW] Fix Sem Dedup (#478)
Test Python package #63: Commit 7cfda44 pushed by sarahyurick
January 16, 2025 20:23 4m 46s main
docs: Update CHANGELOG.md (#475)
Test Python package #62: Commit 9c8f185 pushed by sarahyurick
January 10, 2025 20:55 4m 23s main
Make add_filename str/bool (#465)
Test Python package #61: Commit 2d7e857 pushed by sarahyurick
January 7, 2025 19:52 4m 23s main
Reorder import (#460)
Test Python package #60: Commit db411b0 pushed by sarahyurick
January 2, 2025 22:23 4m 21s main
Add tests/test_classifiers.py PyTests (#421)
Test Python package #59: Commit b8ff71e pushed by sarahyurick
December 23, 2024 21:09 4m 35s main
Bug fix in dockerfile ARG vs ENV var (#446)
Test Python package #58: Commit 35b5993 pushed by sarahyurick
December 23, 2024 18:28 4m 28s main
update test params to account for new minhash algo (#442)
Test Python package #57: Commit c929203 pushed by sarahyurick
December 20, 2024 19:00 5m 28s main
Add blocksize to DocumentDataset.read_* that uses `dask_cudf.read_*…
Test Python package #56: Commit e820b8b pushed by sarahyurick
December 17, 2024 21:02 4m 45s main
Bump RAPIDS stable to 24.12 and RAPIDS nightly to 25.02 (#434)
Test Python package #55: Commit c54826a pushed by sarahyurick
December 17, 2024 20:58 4m 24s main
Add documentation for Instruction-Data-Guard classifier (#398)
Test Python package #54: Commit 86830ab pushed by sarahyurick
December 16, 2024 18:43 1m 43s main
Adding fuzzy and semantic dedupe (#428)
Test Python package #53: Commit 3c3cc98 pushed by sarahyurick
December 13, 2024 22:42 1m 48s main