How Many Nodes and Processors Should I Use?
The best number of nodes and tasks (or processors) for a simulation depends on many factors, including:
- Number of ensemble members
- Number of tiles in domain
- Domain size
- Resolution
- Tile space (cube-sphere vs. EASE)
- I/O configuration (HISTORY, restarts)
- Hardware details (node type)
Requesting more nodes or tasks does not necessarily speed up a simulation and may just waste resources.
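One way to judge whether extra tasks pay off is to compare wall times from a few short test runs. The sketch below computes speedup and parallel efficiency relative to the smallest task count. The function name `parallel_efficiency` and the timings are hypothetical placeholders for illustration, not measurements from any simulation described here.

```python
def parallel_efficiency(wall_times):
    """Return {n_tasks: (speedup, efficiency)} relative to the smallest task count.

    wall_times maps the number of tasks to the measured wall time (any unit).
    Efficiency of 1.0 means perfect scaling; values well below 1.0 mean that
    the additional tasks are mostly wasted.
    """
    base_tasks = min(wall_times)
    base_time = wall_times[base_tasks]
    result = {}
    for n_tasks in sorted(wall_times):
        speedup = base_time / wall_times[n_tasks]
        efficiency = speedup / (n_tasks / base_tasks)
        result[n_tasks] = (speedup, efficiency)
    return result


# Hypothetical wall times in seconds (NOT measured values).
timings = {46: 1000.0, 92: 600.0, 138: 550.0}

for n_tasks, (speedup, efficiency) in parallel_efficiency(timings).items():
    print(f"{n_tasks:4d} tasks: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

With numbers like these, the third node would add little speedup while consuming a full additional node, which is the situation the guidance above warns against.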
The following processor layouts for global domains can serve as a rough guide for NCCS/Discover:
---------------------------------------------------------------------
| N_ens | Resolution | # Tiles | Node Type | # Nodes | # Tasks/Node |
---------------------------------------------------------------------
| 1     | EASEv2_M36 | ~100k   | cas       | 1-2     | 46           |
| 1     | EASEv2_M09 | ~1,600k | cas       | 1-3     | 46           |
| 1     | EASEv2_M09 | ~1,600k | mil       | 1-2     | 126          |
| 1     | CF90       | ~475k   | cas       | 2       | 46           |
---------------------------------------------------------------------
| 24    | EASEv2_M09 | ~1,600k | cas       | 1-3     | 46           |
| 24    | EASEv2_M09 | ~1,600k | mil       | 1-2     | 126          |
---------------------------------------------------------------------
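The total number of tasks for a layout is the number of nodes times the number of tasks per node. As a minimal sketch, the table can be encoded as a small lookup to work out that range. The names `Layout`, `LAYOUTS`, `total_tasks`, and `find_layouts` are illustrative assumptions and not part of any LDAS tooling; only the numbers come from the table.

```python
from typing import NamedTuple, Optional


class Layout(NamedTuple):
    n_ens: int
    resolution: str
    node_type: str
    min_nodes: int
    max_nodes: int
    tasks_per_node: int


# Rows copied from the table above (tile counts omitted).
LAYOUTS = [
    Layout(1, "EASEv2_M36", "cas", 1, 2, 46),
    Layout(1, "EASEv2_M09", "cas", 1, 3, 46),
    Layout(1, "EASEv2_M09", "mil", 1, 2, 126),
    Layout(1, "CF90", "cas", 2, 2, 46),
    Layout(24, "EASEv2_M09", "cas", 1, 3, 46),
    Layout(24, "EASEv2_M09", "mil", 1, 2, 126),
]


def total_tasks(layout: Layout) -> tuple:
    """Return (minimum, maximum) total tasks = nodes * tasks per node."""
    return (
        layout.min_nodes * layout.tasks_per_node,
        layout.max_nodes * layout.tasks_per_node,
    )


def find_layouts(n_ens: int, resolution: str, node_type: Optional[str] = None):
    """Return the table rows matching the given configuration."""
    return [
        row
        for row in LAYOUTS
        if row.n_ens == n_ens
        and row.resolution == resolution
        and (node_type is None or row.node_type == node_type)
    ]


for row in find_layouts(24, "EASEv2_M09"):
    low, high = total_tasks(row)
    print(f"{row.node_type}: {low} to {high} total tasks")
```

For configurations that are not in the table, the nearest row is only a starting point; a few short test runs are a safer basis for choosing the layout.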
Typically, data assimilation does not substantially alter the cost of a simulation or its scaling behavior beyond the need for an ensemble simulation.
These suggestions are based on wall-time requirements for several simulations that are configured like the LDAS_GLOBAL/model and LDAS_GLOBAL/assim test cases, except for the processor layout and the length of the simulation. The graphics below plot wall time vs. number of tasks (excluding the time for initialization, finalization, and post-processing).