Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct scattering index and enable error inflation in ATMS data, #812

Merged

Conversation

jianjunj
Copy link
Contributor

@jianjunj jianjunj commented Dec 5, 2024

Description

The scattering index was not correctly calculated by using a wrong channel at 89GHz. The error inflation referring to it was turned off instead of correcting the calculation. This PR corrects the scattering index and enables error inflation in ATMS observations.

Note, no test has been conducted. However, it very likely creates non-zero difference in GSI results.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

@jianjunj jianjunj self-assigned this Dec 5, 2024
@jianjunj jianjunj added the bug Something isn't working label Dec 5, 2024
Copy link
Contributor

@ADCollard ADCollard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An important bug fix. Thanks.

Copy link
Contributor

@emilyhcliu emilyhcliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Catch!

@RussTreadon-NOAA RussTreadon-NOAA self-assigned this Dec 9, 2024
@RussTreadon-NOAA
Copy link
Contributor

@jianjunj , please follow the GSI code management procedures on the GSI:-How-to-Make-Chhttps://github.com/NOAA-EMC/GSI/wiki/GSI:-How-to-Make-Changesanges for future GSI PRs. Changes to the GSI begin with opening an issue. Test results, both ctests and other necessary tests to validate the correctness of the change, are recorded in the issue (or PR).

You don't need to create an issue for this PR but please do so for future GSI changes. We do, however, need to see ctest and other validation results in this PR. ctests need to be run at a minimum on WCOSS2 and Hera. Running ctests on WCOSS2, Hera, Hercules, and Orion is preferable.

Copy link
Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor changes to adhere to GSI code standards.

src/gsi/clw_mod.f90 Outdated Show resolved Hide resolved
src/gsi/clw_mod.f90 Outdated Show resolved Hide resolved
@RussTreadon-NOAA
Copy link
Contributor

WCOSS2 ctests

Install jianjunj:feature/atms_sacttering_index at fb65b29 as updat and develop at 9ada88e as contrl on Dogwood. Run ctests with the following results

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr812/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  1689.87 sec
2/6 Test #6: global_enkf ......................   Passed  1753.82 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  2116.57 sec
4/6 Test #4: hafs_4denvar_glbens ..............   Passed  2237.49 sec
5/6 Test #2: rtma .............................   Passed  2403.79 sec
6/6 Test #1: global_4denvar ...................***Failed  4926.59 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) = 4926.82 sec

The global_4denvar failure is due to

The results (penalty) between the two runs are nonreproducible,
thus the regression test has Failed on cost for global_4denvar_loproc_updat and global_4denvar_loproc_contrl analyses.

This failure is expected. This PR corrects a bug in the ATMS scattering index and error inflation. Comparison of the contrl and updat fort.207 shows initial differences in the ATMS penalties. The contrl run has

o-g 01 rad   npp          atms           15399934         3454         2590    675.79       675.79      0.26092      0.26092    
o-g 01 rad   n20          atms           15399934         3432         2594    632.69       632.69      0.24391      0.24391    
o-g 01 rad   n21          atms           15399934         3432         2619    628.07       628.07      0.23981      0.23981    

whereas the updat run has

o-g 01 rad   npp          atms           15399934         3454         2590    673.24       673.24      0.25994      0.25994    
o-g 01 rad   n20          atms           15399934         3432         2594    618.16       618.16      0.23830      0.23830    
o-g 01 rad   n21          atms           15399934         3432         2619    621.66       621.66      0.23736      0.23736  

The number of assimilated ATMS observations is identical between the two gsi.x executables. The updat gsi.x has slightly smaller penalties. This reflects corrected error inflation and use of the correct scattering index.

Interestingly, the hafs_3denvar_hybens and rrfs_3denvar_rdasens tests also assimilate ATMS radiances but analysis results are identical between contrl and updat gsi.x. Presumably this is because regional DA does not exercise the code altered by this PR. Is this correct @jianjunj?

@jianjunj
Copy link
Contributor Author

jianjunj commented Dec 9, 2024

@RussTreadon-NOAA Thank you so much for running the ctests. I'll do these test in future PRs. In regards to the zero-differences in hafs_3denvar_hybens and rrfs_3denvar_rdasens tests, I think, it is because that this bug fix is not triggered in this test. It needs the "icw" column to be "1" in "satinfo.txt" to set lcloud_fwd=true and trigger this error inflation over "water" surface:
https://github.com/NOAA-EMC/GSI/pull/812/files#diff-fb4541d749f4c3618c014384b495bcb3c96136d35b21c049fd6c68d08ca47d1cR3664

@RussTreadon-NOAA
Copy link
Contributor

Thank you, @jianjunj, for confirming that the no difference results for hafs_3denvar_hybens and rrfs_3denvar_rdasens are expected and are correct.

Copy link
Contributor

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update working copy of jianjunj:feature/atms_sacttering_index to
101e471. Recompile and rerun global_4denvar ctest. Test still failed (expected). Analysis results from the previous updat and 101e471 updat are identical. This is expected, but it's good to have confirmation.

Approve.

@RussTreadon-NOAA
Copy link
Contributor

@jianjunj , this PR can not move forward until ctests are run on at least Hera and preferably Hera and one of the MSU machines (Orion or Hercules).

@jianjunj
Copy link
Contributor Author

@RussTreadon-NOAA I totally understand.

@RussTreadon-NOAA
Copy link
Contributor

Install develop at 6fd4bd5 and jianjunj:feature/atms_sacttering_index at 03f4e72 on WCOSS2 (Dogwood), Hera, and Orion. Ctests launched on each machine. Results will be posted as ctests complete.

@RussTreadon-NOAA
Copy link
Contributor

WCOSS2 (Dogwood) ctests

Ctest results on Dogwood are as follows

Test project /lfs/h2/emc/da/noscrub/russ.treadon/git/gsi/pr812/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  847.98 sec
2/6 Test #6: global_enkf ......................   Passed  1032.32 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  1277.50 sec
4/6 Test #2: rtma .............................   Passed  1330.28 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  1461.74 sec
6/6 Test #1: global_4denvar ...................***Failed  1985.98 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) = 1985.99 sec

The following tests FAILED:
          1 - global_4denvar (Failed)

The global_4denvar test failed due to

The results (penalty) between the two runs are nonreproducible,
thus the regression test has Failed on cost for global_4denvar_loproc_updat and global_4denvar_loproc_contrl analyses.

Examination of the contrl (<) and updat (>) output show differences in the initial processing of ATMS observations

< o-g 01 rad   npp          atms           15399934         3454         2590    675.79       675.79      0.26092      0.26092    
< o-g 01 rad   n20          atms           15399934         3432         2594    632.69       632.69      0.24391      0.24391    
> o-g 01 rad   npp          atms           15399934         3454         2590    673.24       673.24      0.25994      0.25994    
> o-g 01 rad   n20          atms           15399934         3432         2594    618.16       618.16      0.23830      0.23830    
< o-g 01 rad   n21          atms           15399934         3432         2619    628.07       628.07      0.23981      0.23981    
> o-g 01 rad   n21          atms           15399934         3432         2619    621.66       621.66      0.23736      0.23736    

The amount of assimilated ATMS data between contrl and updat is the same but the penalties differ with updat penalties being smaller. This likely reflects ATMS updates in this PR. Is this a valid statement, @jianjunj?

@RussTreadon-NOAA
Copy link
Contributor

Orion ctests

Ctest results on Orion are as follows

Test project /work2/noaa/da/rtreadon/git/gsi/pr812/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #6: global_enkf ......................   Passed  787.83 sec
2/6 Test #3: rrfs_3denvar_rdasens .............   Passed  969.78 sec
3/6 Test #2: rtma .............................   Passed  1687.74 sec
4/6 Test #5: hafs_3denvar_hybens ..............   Passed  2903.80 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  3022.48 sec
6/6 Test #1: global_4denvar ...................***Failed  3602.80 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) = 3602.84 sec

The following tests FAILED:
          1 - global_4denvar (Failed)

The global_4denvar test failed due to

The results (penalty) between the two runs are nonreproducible,
thus the regression test has Failed on cost for global_4denvar_loproc_updat and global_4denvar_loproc_contrl analyses.

This is the same reason for the WCOSS2 global_4denvar failure. A check of the contrl and updat stdout shows that analysis differences originate with ATMS observations. Below is the difference between contrl (<) and updat (>) `fort.207 statistics for the initial innovations

< o-g 01 rad   npp          atms           15399934         3454         2590    675.79       675.79      0.26092      0.26092
< o-g 01 rad   n20          atms           15399934         3432         2594    632.69       632.69      0.24391      0.24391
> o-g 01 rad   npp          atms           15399934         3454         2590    673.24       673.24      0.25994      0.25994
> o-g 01 rad   n20          atms           15399934         3432         2594    618.16       618.16      0.23830      0.23830
< o-g 01 rad   n21          atms           15399934         3432         2619    628.07       628.07      0.23981      0.23981
> o-g 01 rad   n21          atms           15399934         3432         2619    621.66       621.66      0.23736      0.23736

This behavior is identical to what is observed on WCOSS2.

@RussTreadon-NOAA
Copy link
Contributor

Hera ctests

Ctest results on Hera are as follows

Test project /scratch1/NCEPDEV/da/Russ.Treadon/git/gsi/pr812/build
    Start 1: global_4denvar
    Start 2: rtma
    Start 3: rrfs_3denvar_rdasens
    Start 4: hafs_4denvar_glbens
    Start 5: hafs_3denvar_hybens
    Start 6: global_enkf
1/6 Test #3: rrfs_3denvar_rdasens .............   Passed  1332.60 sec
2/6 Test #6: global_enkf ......................   Passed  3751.26 sec
3/6 Test #5: hafs_3denvar_hybens ..............   Passed  4107.00 sec
4/6 Test #2: rtma .............................   Passed  4752.62 sec
5/6 Test #4: hafs_4denvar_glbens ..............   Passed  5311.49 sec
6/6 Test #1: global_4denvar ...................***Failed  5406.08 sec

83% tests passed, 1 tests failed out of 6

Total Test time (real) = 5406.09 sec

The following tests FAILED:
          1 - global_4denvar (Failed)

The global_4denvar test failed due to

The results (penalty) between the two runs are nonreproducible,
thus the regression test has Failed on cost for global_4denvar_loproc_updat and global_4denvar_loproc_contrl analyses.

The fort.207 differences between contrl (<) and updat (>) are the same as what is observed on WCOSS2 and Orion

< o-g 01 rad   npp          atms           15399934         3454         2590    675.79       675.79      0.26092      0.26092
< o-g 01 rad   n20          atms           15399934         3432         2594    632.69       632.69      0.24391      0.24391
> o-g 01 rad   npp          atms           15399934         3454         2590    673.24       673.24      0.25994      0.25994
> o-g 01 rad   n20          atms           15399934         3432         2594    618.16       618.16      0.23830      0.23830
< o-g 01 rad   n21          atms           15399934         3432         2619    628.07       628.07      0.23981      0.23981
> o-g 01 rad   n21          atms           15399934         3432         2619    621.66       621.66      0.23736      0.23736

@RussTreadon-NOAA
Copy link
Contributor

@jianjunj , the following has been done

  1. jianjunj:feature/atms_sacttering_index updated with recent changes from develop. Done at 03f4e72
  2. jianjunj:feature/atms_sacttering_index installed on WCOSS2 (Dogwood), Hera, and Orion
  3. ctests run on each machine and results reported in this PR.

If you can confirm that the observed ctest behavior is acceptable, we can move this PR forward.

@jianjunj
Copy link
Contributor Author

@RussTreadon-NOAA Thank you very much fort doing the tests! Yes, it is expected to have slightly larger ATMS observational error due to the inflation. @ADCollard @emilyhcliu

@RussTreadon-NOAA RussTreadon-NOAA merged commit 0921251 into NOAA-EMC:develop Dec 12, 2024
2 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants