Commit
fix: lower num_workers to 4
In multi-task training with PyTorch, each data source gets its own dataloader. If each dataloader's worker count is large, the many resulting worker processes stress the CPU.
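A rough sketch of why this matters (the counts below are hypothetical, not from the commit): with one DataLoader per data source, the total number of worker processes grows multiplicatively with the number of sources.

```python
# Hypothetical numbers illustrating worker-process growth in a
# multi-task setup where each data source has its own DataLoader.
num_data_sources = 10          # assumed number of data sources
workers_per_loader_old = 8     # previous default cap
workers_per_loader_new = 4     # new default cap

old_total = num_data_sources * workers_per_loader_old  # 10 * 8 = 80 processes
new_total = num_data_sources * workers_per_loader_new  # 10 * 4 = 40 processes
print(old_total, new_total)
```

Halving the per-loader default halves the total process count, which is what relieves CPU pressure.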

Signed-off-by: Chun Cai <[email protected]>
caic99 authored Jan 6, 2025
1 parent 8d4c27b commit c7435a8
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion deepmd/pt/utils/env.py
```diff
@@ -21,7 +21,7 @@
     ncpus = len(os.sched_getaffinity(0))
 except AttributeError:
     ncpus = os.cpu_count()
-NUM_WORKERS = int(os.environ.get("NUM_WORKERS", min(8, ncpus)))
+NUM_WORKERS = int(os.environ.get("NUM_WORKERS", min(4, ncpus)))
 # Make sure DDP uses correct device if applicable
 LOCAL_RANK = os.environ.get("LOCAL_RANK")
 LOCAL_RANK = int(0 if LOCAL_RANK is None else LOCAL_RANK)
```
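For reference, the changed logic is self-contained enough to run standalone. The sketch below mirrors the post-commit default in `deepmd/pt/utils/env.py`: the `NUM_WORKERS` environment variable wins if set; otherwise the default is capped at 4 or the CPU count available to the process, whichever is smaller. (`os.sched_getaffinity` is Linux-only, hence the `AttributeError` fallback to `os.cpu_count()`.)

```python
import os

# Number of CPUs available to this process; fall back to the total
# CPU count on platforms without sched_getaffinity (e.g. macOS, Windows).
try:
    ncpus = len(os.sched_getaffinity(0))
except AttributeError:
    ncpus = os.cpu_count()

# Default dataloader worker count: env var override, else min(4, ncpus).
NUM_WORKERS = int(os.environ.get("NUM_WORKERS", min(4, ncpus)))
print(NUM_WORKERS)
```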
