Add ctests for running GETKF in split mode and move reference files into repo #175

Merged

Conversation

SamuelDegelia-NOAA
Contributor

@SamuelDegelia-NOAA SamuelDegelia-NOAA commented Sep 17, 2024

This PR adds ctests and YAML files for running GETKF in split observer and solver mode. Running GETKF this way is currently the most efficient option because the observer can use the RoundRobin distribution (see #122). Reference files for the two new ctests are also added to the repo (see below).

There are a couple of additional changes included here:

  • All ctests now run in the same directory instead of individual directories for each test. This is done so that the GETKF solver ctest can access the hofx file output by the observer ctest.
  • The ctest reference files are moved into the repo instead of being linked in from RDAS_DATA. This will make it easier to track these small files and for other developers to add/modify them.
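
The first bullet implies an ordering constraint: the solver test can only run once the observer test has written its hofx file. A minimal Python sketch of that idea follows. It is not the CTest scheduler; `schedule_waves` is a hypothetical helper, and the dependency map is an assumption drawn from the description above. The test names are the ones listed in the ctest output in this description. In CMake, this kind of ordering is commonly declared with the `DEPENDS` test property; whether this PR does exactly that is not shown in this conversation.

```python
def schedule_waves(tests, depends):
    """Group tests into waves of tests that could run concurrently.

    A test joins a wave only after all of its prerequisites finished in an
    earlier wave. Real schedulers start tests as slots free up; waves are a
    simplification for illustration.
    """
    done = set()
    remaining = list(tests)
    waves = []
    while remaining:
        ready = [t for t in remaining
                 if all(dep in done for dep in depends.get(t, []))]
        if not ready:
            raise ValueError("dependency cycle or unknown prerequisite")
        waves.append(ready)
        done.update(ready)
        remaining = [t for t in remaining if t not in ready]
    return waves


TESTS = [
    "rrfs_fv3jedi_hyb_2022052619",
    "rrfs_fv3jedi_letkf_2022052619",
    "rrfs_mpasjedi_2024052700_Ens3Dvar",
    "rrfs_mpasjedi_2024052700_letkf",
    "rrfs_mpasjedi_2024052700_getkf",
    "rrfs_mpasjedi_2024052700_getkf_observer",
    "rrfs_mpasjedi_2024052700_getkf_solver",
    "rrfs_mpasjedi_2024052700_bumploc",
]

# Assumed dependency: the solver needs the hofx file written by the observer.
DEPENDS = {
    "rrfs_mpasjedi_2024052700_getkf_solver":
        ["rrfs_mpasjedi_2024052700_getkf_observer"],
}

for number, wave in enumerate(schedule_waves(TESTS, DEPENDS)):
    print(f"wave {number}: {wave}")
```

Without the dependency entry, all eight tests land in a single wave, so the solver could start before the hofx file exists.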

Ctest output from a fresh clone:

(eva) [Samuel.Degelia@hfe11 rrfs-test]$ ctest
Test project /scratch1/BMC/zrtrr/Samuel.Degelia/RDASApp_dev/RDASApp/build/rrfs-test
    Start 1: rrfs_fv3jedi_hyb_2022052619
1/8 Test #1: rrfs_fv3jedi_hyb_2022052619 ...............   Passed  146.04 sec
    Start 2: rrfs_fv3jedi_letkf_2022052619
2/8 Test #2: rrfs_fv3jedi_letkf_2022052619 .............   Passed   46.69 sec
    Start 3: rrfs_mpasjedi_2024052700_Ens3Dvar
3/8 Test #3: rrfs_mpasjedi_2024052700_Ens3Dvar .........   Passed  114.99 sec
    Start 4: rrfs_mpasjedi_2024052700_letkf
4/8 Test #4: rrfs_mpasjedi_2024052700_letkf ............   Passed   41.88 sec
    Start 5: rrfs_mpasjedi_2024052700_getkf
5/8 Test #5: rrfs_mpasjedi_2024052700_getkf ............   Passed   71.20 sec
    Start 6: rrfs_mpasjedi_2024052700_getkf_observer
6/8 Test #6: rrfs_mpasjedi_2024052700_getkf_observer ...   Passed   48.46 sec
    Start 7: rrfs_mpasjedi_2024052700_getkf_solver
7/8 Test #7: rrfs_mpasjedi_2024052700_getkf_solver .....   Passed   78.14 sec
    Start 8: rrfs_mpasjedi_2024052700_bumploc
8/8 Test #8: rrfs_mpasjedi_2024052700_bumploc ..........   Passed  146.61 sec

100% tests passed, 0 tests failed out of 8

Label Time Summary:
mpi            = 694.01 sec*proc (8 tests)
rdas-bundle    = 694.01 sec*proc (8 tests)
script         = 694.01 sec*proc (8 tests)

Total Test time (real) = 695.06 sec

guoqing-noaa
guoqing-noaa previously approved these changes Sep 18, 2024
Collaborator

@guoqing-noaa guoqing-noaa left a comment

LGTM. But I will do a fresh test on Jet and update the results in about 40 minutes.

@SamuelDegelia-NOAA
Contributor Author

> LGTM. But I will do a fresh test on Jet and update the results in about 40 minutes.

Could you do a test with ctest -j8 too since that has been a point of discussion in this PR? Would like to confirm that it works on other machines too. Thanks!

@guoqing-noaa
Collaborator

> > LGTM. But I will do a fresh test on Jet and update the results in about 40 minutes.
>
> Could you do a test with ctest -j8 too since that has been a point of discussion in this PR? Would like to confirm that it works on other machines too. Thanks!

Yes, I think it is a good idea to run ctest -j8 by default for any future rdas_build_tests.

@guoqing-noaa
Collaborator

Well, @SamuelDegelia-NOAA could you modify the last line of ush/run_rrfs_tests.sh to ctest -j8 in this PR? Thanks!

@SamuelDegelia-NOAA
Contributor Author

> Well, @SamuelDegelia-NOAA could you modify the last line of ush/run_rrfs_tests.sh to ctest -j8 in this PR? Thanks!

Done!

@guoqing-noaa
Collaborator

guoqing-noaa commented Sep 18, 2024

All RRFS tests passed when running from a fresh clone/build on Jet:
/mnt/lfs5/BMC/wrfruc/gge/tmp/rdas_build_test/RDASApp_SamuelDegelia-NOAA_feature_split_getkf_test

The ctest dependency worked as expected.
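
One way to check this kind of dependency from a parallel ctest log is to confirm that the solver's `Start` line appears only after the observer's `Passed` line. The sketch below is illustrative: `starts_after_pass` is a hypothetical helper, the sample lines are modelled on `ctest -j8` output for these tests, and line order in the log is used as a proxy for time order.

```python
import re


def starts_after_pass(log: str, first: str, second: str) -> bool:
    """Return True if `second` is started only after `first` is reported Passed."""
    passed = re.compile(rf"Test\s+#\d+:\s+{re.escape(first)}\s+\.+\s+Passed")
    started = re.compile(rf"Start\s+\d+:\s+{re.escape(second)}\s*$")
    passed_at = started_at = None
    for idx, line in enumerate(log.splitlines()):
        if passed_at is None and passed.search(line):
            passed_at = idx
        if started_at is None and started.search(line):
            started_at = idx
    return (passed_at is not None
            and started_at is not None
            and passed_at < started_at)


OBSERVER = "rrfs_mpasjedi_2024052700_getkf_observer"
SOLVER = "rrfs_mpasjedi_2024052700_getkf_solver"

# Abbreviated sample in the usual ctest output format.
SAMPLE_LOG = """\
    Start 6: rrfs_mpasjedi_2024052700_getkf_observer
    Start 5: rrfs_mpasjedi_2024052700_getkf
1/3 Test #6: rrfs_mpasjedi_2024052700_getkf_observer ...   Passed  219.55 sec
    Start 7: rrfs_mpasjedi_2024052700_getkf_solver
2/3 Test #5: rrfs_mpasjedi_2024052700_getkf ............   Passed  280.20 sec
3/3 Test #7: rrfs_mpasjedi_2024052700_getkf_solver .....   Passed  106.90 sec
"""

print(starts_after_pass(SAMPLE_LOG, OBSERVER, SOLVER))
```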

guoqing-noaa
guoqing-noaa previously approved these changes Sep 18, 2024
@SamuelDegelia-NOAA
Contributor Author

We had some internal discussion today at EMC and would like to keep individual run directories for each ctest. I will update this PR today to account for this change. It should not change much but we will need to retest.

@SamuelDegelia-NOAA SamuelDegelia-NOAA marked this pull request as draft September 19, 2024 17:41
@rrfsbot
Collaborator

rrfsbot commented Sep 26, 2024

started build_and_test on jet at UTC time: Thu Sep 26 20:11:08 UTC 2024
finished at UTC time: Thu Sep 26 20:42:12 UTC 2024

Test project /lfs5/BMC/wrfruc/rrfsbot/PRs_RDASApp/175/build/rrfs-test
    Start 6: rrfs_mpasjedi_2024052700_getkf_observer
    Start 1: rrfs_fv3jedi_hyb_2022052619
    Start 2: rrfs_fv3jedi_letkf_2022052619
    Start 3: rrfs_mpasjedi_2024052700_Ens3Dvar
    Start 4: rrfs_mpasjedi_2024052700_letkf
    Start 5: rrfs_mpasjedi_2024052700_getkf
    Start 8: rrfs_mpasjedi_2024052700_bumploc
1/8 Test #2: rrfs_fv3jedi_letkf_2022052619 .............   Passed   57.98 sec
2/8 Test #1: rrfs_fv3jedi_hyb_2022052619 ...............   Passed  112.35 sec
3/8 Test #3: rrfs_mpasjedi_2024052700_Ens3Dvar .........   Passed  211.49 sec
4/8 Test #6: rrfs_mpasjedi_2024052700_getkf_observer ...   Passed  219.55 sec
    Start 7: rrfs_mpasjedi_2024052700_getkf_solver
5/8 Test #4: rrfs_mpasjedi_2024052700_letkf ............   Passed  260.97 sec
6/8 Test #5: rrfs_mpasjedi_2024052700_getkf ............   Passed  280.20 sec
7/8 Test #7: rrfs_mpasjedi_2024052700_getkf_solver .....   Passed  106.90 sec
8/8 Test #8: rrfs_mpasjedi_2024052700_bumploc ..........   Passed  372.53 sec

100% tests passed, 0 tests failed out of 8

Label Time Summary:
mpi            = 1621.98 sec*proc (8 tests)
rdas-bundle    = 1621.98 sec*proc (8 tests)
script         = 1621.98 sec*proc (8 tests)

Total Test time (real) = 372.55 sec

workdir: /lfs5/BMC/wrfruc/rrfsbot/PRs_RDASApp/175
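
As a reading aid for the two totals in the Jet run above, the sketch below recomputes them from the per-test times. The numbers are consistent with the label summary (`sec*proc`) being the sum of the individual test times, with each test weighted as one processor, and with the real time being bounded below by the longest test or dependency chain. Treating observer followed by solver as the only chain is an assumption based on this PR's description.

```python
# Per-test wall times in seconds, copied from the Jet run above.
jet_times = {
    "rrfs_fv3jedi_hyb_2022052619": 112.35,
    "rrfs_fv3jedi_letkf_2022052619": 57.98,
    "rrfs_mpasjedi_2024052700_Ens3Dvar": 211.49,
    "rrfs_mpasjedi_2024052700_letkf": 260.97,
    "rrfs_mpasjedi_2024052700_getkf": 280.20,
    "rrfs_mpasjedi_2024052700_getkf_observer": 219.55,
    "rrfs_mpasjedi_2024052700_getkf_solver": 106.90,
    "rrfs_mpasjedi_2024052700_bumploc": 372.53,
}

# Sum of all test times; compare with the reported 1621.98 sec*proc
# (ctest sums unrounded values, so a 0.01 s difference is expected).
summed = sum(jet_times.values())

# Assumed dependency chain: the solver runs after the observer.
chain = (jet_times["rrfs_mpasjedi_2024052700_getkf_observer"]
         + jet_times["rrfs_mpasjedi_2024052700_getkf_solver"])

# With enough parallel slots, wall time cannot be shorter than the
# longest single test or the longest chain; compare with 372.55 s real.
lower_bound = max(max(jet_times.values()), chain)

print(f"sum of test times:     {summed:.2f} s")
print(f"observer+solver chain: {chain:.2f} s")
print(f"wall-time lower bound: {lower_bound:.2f} s")
```

Here the longest test (bumploc) sets the wall time, not the observer and solver chain.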

@rrfsbot
Collaborator

rrfsbot commented Sep 26, 2024

started build_and_test on hercules at UTC time: Thu Sep 26 20:10:54 UTC 2024
finished at UTC time: Thu Sep 26 20:44:20 UTC 2024

Test project /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/175/build/rrfs-test
    Start 6: rrfs_mpasjedi_2024052700_getkf_observer
    Start 1: rrfs_fv3jedi_hyb_2022052619
    Start 2: rrfs_fv3jedi_letkf_2022052619
    Start 3: rrfs_mpasjedi_2024052700_Ens3Dvar
    Start 4: rrfs_mpasjedi_2024052700_letkf
    Start 5: rrfs_mpasjedi_2024052700_getkf
    Start 8: rrfs_mpasjedi_2024052700_bumploc
1/8 Test #2: rrfs_fv3jedi_letkf_2022052619 .............   Passed  124.32 sec
2/8 Test #6: rrfs_mpasjedi_2024052700_getkf_observer ...   Passed  136.52 sec
    Start 7: rrfs_mpasjedi_2024052700_getkf_solver
3/8 Test #3: rrfs_mpasjedi_2024052700_Ens3Dvar .........   Passed  146.70 sec
4/8 Test #4: rrfs_mpasjedi_2024052700_letkf ............   Passed  152.92 sec
5/8 Test #5: rrfs_mpasjedi_2024052700_getkf ............   Passed  162.17 sec
6/8 Test #7: rrfs_mpasjedi_2024052700_getkf_solver .....   Passed   43.47 sec
7/8 Test #1: rrfs_fv3jedi_hyb_2022052619 ...............   Passed  192.92 sec
8/8 Test #8: rrfs_mpasjedi_2024052700_bumploc ..........   Passed  211.25 sec

100% tests passed, 0 tests failed out of 8

Label Time Summary:
mpi            = 1170.28 sec*proc (8 tests)
rdas-bundle    = 1170.28 sec*proc (8 tests)
script         = 1170.28 sec*proc (8 tests)

Total Test time (real) = 211.26 sec

workdir: /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/175

@rrfsbot
Collaborator

rrfsbot commented Sep 26, 2024

started build_and_test on hera at UTC time: Thu Sep 26 20:15:57 UTC 2024
finished at UTC time: Thu Sep 26 20:52:50 UTC 2024

Test project /scratch1/NCEPDEV/fv3-cam/rrfsbot/PRs_RDASApp/175/build/rrfs-test
    Start 6: rrfs_mpasjedi_2024052700_getkf_observer
    Start 1: rrfs_fv3jedi_hyb_2022052619
    Start 2: rrfs_fv3jedi_letkf_2022052619
    Start 3: rrfs_mpasjedi_2024052700_Ens3Dvar
    Start 4: rrfs_mpasjedi_2024052700_letkf
    Start 5: rrfs_mpasjedi_2024052700_getkf
    Start 8: rrfs_mpasjedi_2024052700_bumploc
1/8 Test #2: rrfs_fv3jedi_letkf_2022052619 .............   Passed   34.83 sec
2/8 Test #6: rrfs_mpasjedi_2024052700_getkf_observer ...   Passed   45.65 sec
    Start 7: rrfs_mpasjedi_2024052700_getkf_solver
3/8 Test #5: rrfs_mpasjedi_2024052700_getkf ............   Passed   63.11 sec
4/8 Test #4: rrfs_mpasjedi_2024052700_letkf ............   Passed   67.77 sec
5/8 Test #3: rrfs_mpasjedi_2024052700_Ens3Dvar .........   Passed   85.17 sec
6/8 Test #1: rrfs_fv3jedi_hyb_2022052619 ...............   Passed   97.43 sec
7/8 Test #7: rrfs_mpasjedi_2024052700_getkf_solver .....   Passed   65.77 sec
8/8 Test #8: rrfs_mpasjedi_2024052700_bumploc ..........   Passed  161.80 sec

100% tests passed, 0 tests failed out of 8

Label Time Summary:
mpi            = 621.54 sec*proc (8 tests)
rdas-bundle    = 621.54 sec*proc (8 tests)
script         = 621.54 sec*proc (8 tests)

Total Test time (real) = 161.83 sec

workdir: /scratch1/NCEPDEV/fv3-cam/rrfsbot/PRs_RDASApp/175

@TingLei-NOAA
Contributor

The fv3-jedi ctest failures I reported earlier (missing files) turned out to be caused by my running the ctests in parallel mode, while the dependencies between some fv3-jedi ctests are not labelled accordingly.
After re-running the ctests in serial mode, all fv3-jedi ctests passed.

@ShunLiu-NOAA

@SamuelDegelia-NOAA, @TingLei-NOAA, @guoqing-noaa, and @rrfsbot, thank you for your efforts. The PR is good to merge now.

@ShunLiu-NOAA ShunLiu-NOAA merged commit 4bcc3cc into NOAA-EMC:develop Sep 27, 2024
1 check passed
@SamuelDegelia-NOAA
Contributor Author

Thanks @TingLei-NOAA for investigating!

@SamuelDegelia-NOAA SamuelDegelia-NOAA deleted the feature/split_getkf_test branch September 27, 2024 18:39