[INSTALL] Update CRTM-2.4.0 on all HPC machines #519

Open
emilyhcliu opened this issue Apr 21, 2023 · 14 comments

@emilyhcliu

emilyhcliu commented Apr 21, 2023

The crtm version 2.4.0 installed under hpc-stack: /apps/contrib/NCEP/libs/hpc-stack/modulefiles/stack is outdated and needs an update.

Issue #517 is related to this issue.

Which software in the stack would you like installed?
crtm version 2.4.0 and related coefficient files

What is the version/tag of the software?
release/REL-2.4.0_emc

What compilation options would you like set?
intel-2018.4

Which machines would you like to have the software installed?
All HPC machines other than HERA
HERA already updated.

Any other relevant information that we should know to correctly install the software?

Additional context
Question: For ORION, the hpc-stack is the one under active maintenance, the hpc-stack-gfsv16 is not, correct?

@jkbk2004

@emilyhcliu EPIC has been supporting hpc-stack on Orion at /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2. Regarding hpc-stack-gfsv16, there may be a permission issue on the EPIC side with accessing /apps/contrib/NCEP/libs/hpc-stack/modulefiles/stack. By any chance, could hpc-stack-gfsv16 be migrated to an EPIC location? @natalie-perlin FYI

emilyhcliu changed the title from "[INSTALL] Update CRTM-2.4.0 on ORION" to "[INSTALL] Update CRTM-2.4.0 on all HPC machines" on Apr 21, 2023
@emilyhcliu
Author

@jkbk2004 GSI has a run-time issue when it is compiled with intel-2022. The issue is tracked in NOAA-EMC/GSI#447. So we cannot use the libraries under intel-2022.

For GSI develop, we would like to move to the EPIC HPC stacks (hopefully, we can resolve the issue with intel-2022).
For GSI release/gfsda.v16 (currently used in operational systems), we would like to stay with the hpc-stacks.

So, I think there are two options:
(1) make intel-2018 also available under EPIC HPC
(2) update the hpc-stack

Any thoughts?

@jkbk2004

@natalie-perlin can we add the hpc stack option on orion epic space for this gsi requirement?

@natalie-perlin
Collaborator

natalie-perlin commented Apr 26, 2023

@jkbk2004 @emilyhcliu - The intel-18 modules for the GSI may only be helpful as a debugging step until the issue with higher-version intel compilers is solved. However, using different compilers to build different parts of the UFS Apps may not be a community-recommended approach...

@DavidHuber-NOAA
Contributor

@natalie-perlin The GSI cannot run with intel 2021+ on any system until the above mentioned issue is resolved. I think everyone agrees that it would be ideal for everything to move to Intel 2022, but, unfortunately, this is not possible for the GSI yet. So all GSI dependencies, including CRTM, need to be compiled with Intel 18 for the time being on all systems.

@natalie-perlin
Collaborator

@DavidHuber-NOAA - what about the cases with GNU compilers? EPIC supports software stacks with GNU compilers on Hera and Cheyenne, built to support UFS-WM and UFS-SRW.

@DavidHuber-NOAA
Contributor

@natalie-perlin I believe these would be required as well, though I can't say with certainty. I've only been helping with the Intel 2022 issue and am not an authority on the GSI otherwise.

@natalie-perlin
Collaborator

natalie-perlin commented Apr 27, 2023

The stack for GSI modules built with the intel-2018.4 + impi/2018.4 compilers is ready on Orion.
Modules that are listed in gsi_common.lua and gsi_orion.lua are built.

The way to load:

module use /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4/modulefiles/stack
module load hpc/1.2.0

Lines 4-6 in gsi_orion.lua would then become:

prepend_path("MODULEPATH", "/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4/modulefiles/stack")

local hpc_ver=os.getenv("hpc_ver") or "1.2.0"

Update: Alternatives built: w3emc/2.9.1, w3emc/2.9.2

An identical stack is being built with the intel-2022.1.2 compiler, which hopefully can be used to debug the issue with compilers newer than intel/2020. (Fingers crossed)

Please let us know if you have any comments on the modules that were built or on any that still need to be built.

@natalie-perlin
Collaborator

HPC-stack with intel/2022.1.2 compiler on Orion:

module use /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_gsi/modulefiles/stack
module load hpc/1.2.0
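
As a minimal sketch, the two Orion stacks above (intel-2018.4 and intel-2022.1.2_gsi) could be selected through one small wrapper. The function name `use_gsi_stack` and the variable `HPC_STACK_LIBS` are hypothetical and not part of hpc-stack; the sketch only assumes that the `module` command is available in the shell, as it is on Orion login nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of hpc-stack: pick an EPIC GSI stack on Orion
# by compiler tag, then run the same "module use" / "module load" pair as above.

# Root of the EPIC stacks on Orion (path taken from the comments above).
HPC_STACK_LIBS="${HPC_STACK_LIBS:-/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs}"

use_gsi_stack() {
    local tag="${1:-}"                 # e.g. intel-2018.4 or intel-2022.1.2_gsi
    local ver="${hpc_ver:-1.2.0}"      # same default as in gsi_orion.lua
    local stack_dir="${HPC_STACK_LIBS}/${tag}/modulefiles/stack"

    if [ -z "$tag" ]; then
        echo "usage: use_gsi_stack <compiler-tag>" >&2
        return 2
    fi
    # Fail early with a readable message if the stack is not on this machine.
    if [ ! -d "$stack_dir" ]; then
        echo "stack not found: $stack_dir" >&2
        return 1
    fi
    module use "$stack_dir" && module load "hpc/${ver}"
}

# Usage (on Orion):
#   use_gsi_stack intel-2018.4
#   use_gsi_stack intel-2022.1.2_gsi
```

Checking for the directory first keeps a mistyped tag from silently adding a non-existent path to MODULEPATH.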

@natalie-perlin
Collaborator

@jkbk2004 -
The crtm/2.4.0 fix files have been updated on all the NOAA RDHPC systems. The updated CRTM-2.4.0 code, which does not have the excessive printout statements mentioned in GSI Issue-556, has so far been installed only in the newer EPIC-maintained hpc-stacks that are based on netcdf-4.9.2. EPIC's set of stacks with netcdf-4.7.4 still uses the library version built with excessive printouts.
I'd like to update crtm/2.4.0 in these current stacks, as this was raised as an issue by the GSI team. When is the best time to do the update to avoid disrupting any RT testing (weekend, early mornings, after PR-1745)? WM may move to the updated netcdf-4.9.2-based stacks, as in ufs-community/ufs-weather-model#1745, which are free of excessive printouts. But other repositories, such as GSI, global_workflow, UFS_UTILS, SRW, etc., may still be using older, netcdf-4.7.4-based stack builds for some time.
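
One rough way to tell which family a loaded stack belongs to is to compare its netCDF version against 4.9.2. The sketch below is only an illustration: `version_lt` and `netcdf_stack_family` are hypothetical helper names, `sort -V` requires GNU coreutils, and the usage line assumes `nc-config` is on the PATH once the stack's netcdf module is loaded. The netCDF version says nothing about CRTM itself; it only reflects the situation described in this comment, where the netcdf-4.7.4-based stacks had not yet been updated.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: classify a stack by the netCDF version it was built with.

# Return success if version $1 is strictly lower than version $2.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print which stack family a netCDF version string (e.g. 4.7.4) falls into.
netcdf_stack_family() {
    if version_lt "$1" "4.9.2"; then
        echo "pre-4.9.2"
    else
        echo "4.9.2-or-later"
    fi
}

# Usage, assuming nc-config prints a line such as "netCDF 4.7.4":
#   netcdf_stack_family "$(nc-config --version | awk '{print $2}')"
```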

@natalie-perlin
Collaborator

natalie-perlin commented May 12, 2023

@emilyhcliu @jkbk2004 - I plan to fully update CRTM-2.4.0 with the new code that contains a bug fix in all EPIC-maintained hpc-stacks built with netcdf/4.7.4 over the weekend, when it is unlikely to interfere with the WM and SRW tests. So far, the update has been done only in the newer stacks built with netcdf/4.9.2.

A stack with intel/2018.4 on Orion was built recently (May 2, 2023), and it uses the recent CRTM-2.4.0 code:
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4
The same is true for the limited-library stack built for the GSI team on Orion with the intel/2022.1.2 compiler, in
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_gsi

The crtm/2.4.0 stack update also requires rebuilding the upp library, since it depends on crtm. I will post here when done.

@natalie-perlin
Collaborator

All the active and current EPIC stacks have been updated with the latest CRTM/2.4.0 and corresponding CRTM fix files. Please see below the stack locations:
Hera intel/2022.1.2: /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/intel-2022.1.2,
/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/intel-2022.1.2_ncdf492

Hera gnu/9.2.0: /scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2,
/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2_ncdf492

Orion intel/2022.1.2: /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2,
/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2_ncdf492/

Orion intel/2018.4: /work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4

Jet intel/2022.1.2: /mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-2022.1.2,
/mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-2022.1.2_ncdf492

Jet intel/2018: /mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-18.0.5.274

Cheyenne intel/2022.1: /glade/work/epicufsrt/contrib/hpc-stack/intel2022.1,
/glade/work/epicufsrt/contrib/hpc-stack/src-intel2022.1_ncdf492

Cheyenne gnu/10.1.0: /glade/work/epicufsrt/contrib/hpc-stack/gnu10.1.0,
/glade/work/epicufsrt/contrib/hpc-stack//gnu10.1.0_ncdf49

Gaea intel-classic/2022.2.1: /lustre/f2/dev/role.epic/contrib/hpc-stack/intel-classic-2022.2.1
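
For scripts that run on more than one machine, the list above could be wrapped in a small lookup, sketched here. `epic_stack_root` is a hypothetical function name; the keys are simply the machine name and the stack directory name, and only the first path listed for each entry is included (the `_ncdf492` variants are left out).

```shell
#!/usr/bin/env bash
# Hypothetical helper: map "<machine> <stack-dir-name>" to the EPIC stack root.
# Paths are copied from the list above.
epic_stack_root() {
    case "${1:-}:${2:-}" in
        hera:intel-2022.1.2)          echo "/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/intel-2022.1.2" ;;
        hera:gnu-9.2)                 echo "/scratch1/NCEPDEV/nems/role.epic/hpc-stack/libs/gnu-9.2" ;;
        orion:intel-2022.1.2)         echo "/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2022.1.2" ;;
        orion:intel-2018.4)           echo "/work/noaa/epic-ps/role-epic-ps/hpc-stack/libs/intel-2018.4" ;;
        jet:intel-2022.1.2)           echo "/mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-2022.1.2" ;;
        jet:intel-18.0.5.274)         echo "/mnt/lfs4/HFIP/hfv3gfs/role.epic/hpc-stack/libs/intel-18.0.5.274" ;;
        cheyenne:intel2022.1)         echo "/glade/work/epicufsrt/contrib/hpc-stack/intel2022.1" ;;
        cheyenne:gnu10.1.0)           echo "/glade/work/epicufsrt/contrib/hpc-stack/gnu10.1.0" ;;
        gaea:intel-classic-2022.2.1)  echo "/lustre/f2/dev/role.epic/contrib/hpc-stack/intel-classic-2022.2.1" ;;
        *)
            echo "unknown machine/stack: ${1:-} ${2:-}" >&2
            return 1
            ;;
    esac
}

# Usage; the modulefiles/stack suffix is an assumption based on the Orion
# examples earlier in this thread:
#   module use "$(epic_stack_root orion intel-2018.4)/modulefiles/stack"
```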

@BijuThomas-NOAA

@DavidHuber-NOAA and @emilyhcliu Has anybody tried to run GSI with spack-stack (stack-intel/2021.7.1) on Hercules? It compiled successfully for me on Hercules, but I have issues at run time.

@DavidHuber-NOAA
Contributor

@BijuThomas-NOAA No, I have not tried yet. The GSI does not yet run with Intel 2021+ (NOAA-EMC/GSI#447, NOAA-EMC/GSI#571), but I have it working on Orion and am actively working on it on Hera, where there is an apparent communication problem. @natalie-perlin has gotten it to work on Gaea and is actively working on Cheyenne. After that, we could perhaps try out spack-stack and then Hercules.
