Actions: huggingface/datatrove

Showing runs from all workflows
1,280 workflow runs

Fix issues with URL Deduplication when using the Index
Test & Check Code Quality #385: Pull request #327 synchronize by muzzynine
January 22, 2025 12:26 3m 17s muzzynine:fix_url_dedup
Update README.md (#323)
Test & Check Code Quality #384: Commit 8063aed pushed by guipenedo
January 22, 2025 11:15 2m 49s main
Update README.md (#323)
Secret Leaks #206: Commit 8063aed pushed by guipenedo
January 22, 2025 11:15 18s main
Fix issues with URL Deduplication when using the Index
Test & Check Code Quality #383: Pull request #327 opened by muzzynine
January 22, 2025 09:17 Action required muzzynine:fix_url_dedup
fixes stopwors implementation...
Secret Leaks #205: Commit f8e78f5 pushed by guipenedo
January 20, 2025 15:49 18s stopwords_set
Add customization for fetching SLURM job id
Test & Check Code Quality #381: Pull request #320 synchronize by BramVanroy
January 10, 2025 15:38 3m 11s BramVanroy:main
Add customization for fetching SLURM job id
Test & Check Code Quality #380: Pull request #320 opened by BramVanroy
January 10, 2025 15:32 Action required BramVanroy:main
fix(utils): Enhance the dependencies check to include pip distributio…
Secret Leaks #204: Commit 2260603 pushed by guipenedo
January 9, 2025 18:31 17s main
fix(utils): Enhance the dependencies check to include pip distributio…
Test & Check Code Quality #379: Commit 2260603 pushed by guipenedo
January 9, 2025 18:31 22s main
fix(utils): Enhance the dependencies check to include pip distribution
Test & Check Code Quality #378: Pull request #317 synchronize by guipenedo
January 9, 2025 18:24 19s aiqwe:main
Add glob pattern for hash index (#313)
Secret Leaks #203: Commit cd61018 pushed by guipenedo
January 9, 2025 12:47 22s main
Add glob pattern for hash index (#313)
Test & Check Code Quality #377: Commit cd61018 pushed by guipenedo
January 9, 2025 12:47 2m 32s main
style fix
Secret Leaks #202: Commit b9b24cf pushed by guipenedo
January 9, 2025 12:47 22s decont-glob
nit
Secret Leaks #201: Commit 3168cf5 pushed by guipenedo
January 9, 2025 12:39 17s main
nit
Test & Check Code Quality #375: Commit 3168cf5 pushed by guipenedo
January 9, 2025 12:39 3m 18s main
clean up PipelineStepWithTokenizer
Secret Leaks #200: Commit 66221c8 pushed by guipenedo
January 9, 2025 12:38 18s main
clean up PipelineStepWithTokenizer
Test & Check Code Quality #374: Commit 66221c8 pushed by guipenedo
January 9, 2025 12:38 2m 43s main
load_tokenizer can now load local hf folder (#306)
Test & Check Code Quality #373: Commit 8adc0b9 pushed by guipenedo
January 9, 2025 12:31 3m 33s main
load_tokenizer can now load local hf folder (#306)
Secret Leaks #199: Commit 8adc0b9 pushed by guipenedo
January 9, 2025 12:31 18s main
Add job_id_position Parameter to launch_slurm_job Method (#282)
Test & Check Code Quality #372: Commit 2fc7660 pushed by guipenedo
January 9, 2025 12:30 3m 29s main
Add job_id_position Parameter to launch_slurm_job Method (#282)
Secret Leaks #198: Commit 2fc7660 pushed by guipenedo
January 9, 2025 12:30 18s main
Adding Megatron Tokenization pipeline (#304)
Test & Check Code Quality #371: Commit 338b3ad pushed by guipenedo
January 9, 2025 12:18 2m 39s main
Adding Megatron Tokenization pipeline (#304)
Secret Leaks #197: Commit 338b3ad pushed by guipenedo
January 9, 2025 12:18 18s main
fix capitalization edge cases + add cache
Secret Leaks #196: Commit 0489711 pushed by guipenedo
January 9, 2025 11:46 21s aiqwe/main
changed ftfy defaults (#319)
Secret Leaks #195: Commit 0c891f6 pushed by guipenedo
January 8, 2025 20:13 23s main