Build using compute nodes #3131
Labels: feature (New feature or request)

Comments
@WalterKolczynski-NOAA will this be an option or the only way to build?

@CoryMartin-NOAA I intend to make this an option.

@DavidHuber-NOAA and @WalterKolczynski-NOAA: GDASApp issue #1328 documents a successful compute-node (i.e., batch job) build of GDASApp inside g-w.

Fantastic, thanks @RussTreadon-NOAA!
This was referenced Dec 16, 2024

WalterKolczynski-NOAA pushed a commit that referenced this issue on Dec 24, 2024:
This creates scripts to run compute-node builds and also refactors the `build_all.sh` script to make it easier to build all executables. In place of the various options controlling which components are built, `build_all.sh` now takes a list of one or more systems to build:

- `gfs` builds everything needed for forecast-only GFS (UFS model with unstructured wave grid, gfs_utils, ufs_utils, upp, ww3 pre/post for unstructured wave grid)
- `gefs` builds everything needed for GEFS (UFS model with structured wave grid, gfs_utils, ufs_utils, upp, ww3 pre/post for structured wave grid)
- `sfs` builds everything needed for SFS (UFS model in hydrostatic mode with unstructured wave grid, gfs_utils, ufs_utils, upp, ww3 pre/post for structured wave grid)
- `gsi` builds GSI-based DA components (gsi_enkf, gsi_monitor, gsi_utils)
- `gdas` builds JEDI-based DA components (gdas app, gsi_monitor, gsi_utils)
- `all` builds all of the above (mostly for testing)

Examples:

Build for forecast-only GFS:
```
./build_all.sh gfs
```

Build cycled GFS including coupled DA:
```
./build_all.sh gfs gsi gdas
```

Build GEFS:
```
./build_all.sh gefs
```

Build everything (for testing purposes):
```
./build_all.sh all
```

Other options, such as `-d` to build in debug mode, remain unchanged. The full script signature is now:
```
./build_all.sh [-a UFS_app][-c build_config][-d][-f][-h][-v] [gfs] [gefs] [sfs] [gsi] [gdas] [all]
```

Additionally, there is a new script to build components on the compute nodes using the job scheduler instead of the login node. This method takes the load off of the login nodes and may be faster in some cases. The compute build is invoked using the `build_compute.sh` script, which behaves similarly to the new `build_all.sh`:
```
./build_compute.sh [-h][-v][-A <hpc-account>] [gfs] [gefs] [sfs] [gsi] [gdas] [all]
```

The compute build generates a rocoto workflow and then calls `rocotorun` itself repeatedly until either a build fails or all builds succeed, at which point the script exits. Since the script calls `rocotorun` itself, there is no need to set up a cron job, but advanced users can still use all the regular rocoto tools on `build.xml` and `build.db` if they wish.

Some things to note with the compute build:
- When a build fails, the other build jobs are not cancelled and will continue to run.
- Since the script stops running `rocotorun` once one build fails, the rocoto database will no longer update with the status of the remaining jobs after that point.
- Similarly, if the terminal running `build_compute.sh` is disconnected, the rocoto database will no longer update.
- In either of the above cases, you can run `rocotorun` manually to update the database as long as the job information hasn't aged off the scheduler (see the sketch below).

Resolves #3131

---------

Co-authored-by: Rahul Mahajan <[email protected]>
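As a minimal sketch of that manual update, assuming `build.xml` and `build.db` were generated in the directory where `build_compute.sh` was run (the path below is a placeholder, not taken from the actual layout):

```
# Placeholder path: change to the directory where build_compute.sh was run
cd /path/to/global-workflow/sorc

# Ask rocoto to poll the scheduler and refresh the build database
rocotorun -w build.xml -d build.db

# Inspect the current state of each build task
rocotostat -w build.xml -d build.db
```

This only works while the scheduler still remembers the jobs; once they age off, rocoto has nothing left to query.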
What new functionality do you need?
The capability to build on compute nodes using the job scheduler
What are the requirements for the new functionality?
Build all components using the system job scheduler, whether that is Slurm or PBS Pro. Most programs should be able to build on compute nodes, but GDASApp will need to build in the service queue for now, as it (currently) requires access to the outside network during its build.
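As an illustration only, a minimal sketch of submitting a build as a batch job under either scheduler; the account, queue, resource requests, and the idea of wrapping `./build_all.sh gfs` are assumptions for this example, not the actual implementation:

```
#!/usr/bin/env bash
# Illustrative sketch only: account, queue, and resource values are placeholders.
set -euo pipefail

if command -v sbatch >/dev/null 2>&1; then
    # Slurm: wrap the build command in a one-node batch job
    sbatch --account=my_hpc_account --job-name=build_gfs \
           --nodes=1 --time=02:00:00 \
           --wrap="./build_all.sh gfs"
elif command -v qsub >/dev/null 2>&1; then
    # PBS Pro: pipe an equivalent job script to qsub
    echo 'cd "$PBS_O_WORKDIR" && ./build_all.sh gfs' | \
        qsub -A my_hpc_account -N build_gfs \
             -l select=1:ncpus=16 -l walltime=02:00:00
else
    echo "No supported scheduler (Slurm or PBS Pro) found" >&2
    exit 1
fi
```

In practice the merged solution drives this through a generated rocoto workflow via `build_compute.sh`; the sketch only shows the scheduler-level requirement.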
Acceptance Criteria
Suggest a solution (optional)
No response