Splitting up the archive task #3242

Open
wants to merge 116 commits into
base: develop

Conversation

AntonMFernando-NOAA
Contributor

@AntonMFernando-NOAA AntonMFernando-NOAA commented Jan 21, 2025

Description

  • This PR splits the archive task into two parts. The first always runs and only copies verification data to the VRFY_ARC and ARCDIR directories. The second runs only when HPSSARCH or LOCALARCH is set to YES; it generates tarballs and stores them in ATARDIR, either on HPSS or locally.

  • Resolves Split up the archive task #3152
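
The split described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: the helper name `archive_tasks` and the plain-dict configuration are hypothetical, while the `HPSSARCH`/`LOCALARCH` switches and the `arch_vrfy`/`arch_tars` task names are taken from this thread.

```python
def archive_tasks(config: dict, run: str = "gfs") -> list:
    """Return the archive task names to schedule for one run.

    Hypothetical helper for illustration; `config` is assumed to be a
    plain dict holding the HPSSARCH and LOCALARCH switches as strings.
    """
    # The verification-copy task always runs.
    tasks = [f"{run}_arch_vrfy"]

    # The tarball task runs only when HPSS or local archiving is requested.
    wants_hpss = config.get("HPSSARCH", "NO") == "YES"
    wants_local = config.get("LOCALARCH", "NO") == "YES"
    if wants_hpss or wants_local:
        tasks.append(f"{run}_arch_tars")

    return tasks
```

With both switches unset or set to NO, only the verification task is returned; setting either one to YES adds the tarball task.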

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO (If YES, please add a link to any PRs that are pending.)
    • EMC verif-global
    • GDAS
    • GFS-utils
    • GSI
    • GSI-monitor
    • GSI-utils
    • UFS-utils
    • UFS-weather-model
    • wxflow

How has this been tested?

  • CI tests in Hera

Example:

  • Cycled test on Hera
  • Forecast-only on Hera

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary

@emcbot

emcbot commented Jan 28, 2025

Experiment C96_atm3DVar FAILED on Hera in Build# 3 in
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/EXPDIR/C96_atm3DVar_b2bf61c8

@emcbot

emcbot commented Jan 28, 2025

Experiment C96C48_ufs_hybatmDA FAILED on Hera in Build# 3 with error logs:

/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_b2bf61c8/logs/2024022400/gfs_arch_tars.log

Follow link here to view the contents of the above file(s): (link)

@emcbot

emcbot commented Jan 28, 2025

Experiment C96C48_ufs_hybatmDA FAILED on Hera in Build# 3 in
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/EXPDIR/C96C48_ufs_hybatmDA_b2bf61c8

@emcbot

emcbot commented Jan 28, 2025

Experiment C48_S2SW FAILED on Hera in Build# 3 with error logs:

/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/COMROOT/C48_S2SW_b2bf61c8/logs/2021032312/gfs_arch_tars.log

Follow link here to view the contents of the above file(s): (link)

@emcbot

emcbot commented Jan 28, 2025

Experiment C48_S2SW FAILED on Hera in Build# 3 in
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/EXPDIR/C48_S2SW_b2bf61c8

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Jan 28, 2025
@emcbot

emcbot commented Jan 28, 2025

CI Failed on Hera in Build# 3
Built and ran in directory /scratch1/NCEPDEV/global/CI/3242


Experiment C48_ATM_b2bf61c8 Completed 1 Cycles: *SUCCESS* at Tue Jan 28 19:26:49 UTC 2025
Experiment C48mx500_3DVarAOWCDA_b2bf61c8 Completed 2 Cycles: *SUCCESS* at Tue Jan 28 19:38:58 UTC 2025
Experiment C48mx500_hybAOWCDA_b2bf61c8 Completed 2 Cycles: *SUCCESS* at Tue Jan 28 19:45:03 UTC 2025
Experiment C96_atm3DVar_b2bf61c8 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Tue Jan 28 20:33:47 UTC 2025
Experiment C96_atm3DVar_b2bf61c8 Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/COMROOT/C96_atm3DVar_b2bf61c8/logs/2021122100/gfs_arch_tars.log
Experiment C96C48_hybatmDA_b2bf61c8 Completed 3 Cycles: *SUCCESS* at Tue Jan 28 20:33:54 UTC 2025
Experiment C96C48_hybatmaerosnowDA_b2bf61c8 Completed 3 Cycles: *SUCCESS* at Tue Jan 28 20:58:14 UTC 2025
Experiment C96C48_ufs_hybatmDA_b2bf61c8 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Tue Jan 28 21:06:50 UTC 2025
Experiment C96C48_ufs_hybatmDA_b2bf61c8 Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_b2bf61c8/logs/2024022400/gfs_arch_tars.log
Experiment C48_S2SW_b2bf61c8 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Tue Jan 28 21:16:24 UTC 2025
Experiment C48_S2SW_b2bf61c8 Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/COMROOT/C48_S2SW_b2bf61c8/logs/2021032312/gfs_arch_tars.log

@WalterKolczynski-NOAA WalterKolczynski-NOAA added CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Jan 29, 2025
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Jan 29, 2025
@emcbot

emcbot commented Jan 29, 2025

Experiment C96_atm3DVar FAILED on Hera in Build# 4 with error logs:

/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/COMROOT/C96_atm3DVar_b4eee81f/logs/2021122100/gfs_arch_tars.log

Follow link here to view the contents of the above file(s): (link)

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Jan 29, 2025
@emcbot

emcbot commented Jan 29, 2025

Experiment C96_atm3DVar FAILED on Hera in Build# 4 in
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/EXPDIR/C96_atm3DVar_b4eee81f

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Jan 29, 2025
@emcbot

emcbot commented Jan 29, 2025

CI Failed on Hera in Build# 4
Built and ran in directory /scratch1/NCEPDEV/global/CI/3242


Experiment C48_ATM_b4eee81f Completed 1 Cycles: *SUCCESS* at Wed Jan 29 09:29:16 UTC 2025
Experiment C96_S2SWA_gefs_replay_ics_b4eee81f Completed 1 Cycles: *SUCCESS* at Wed Jan 29 09:35:25 UTC 2025
Experiment C48mx500_hybAOWCDA_b4eee81f Completed 2 Cycles: *SUCCESS* at Wed Jan 29 09:47:31 UTC 2025
Experiment C96_atm3DVar_b4eee81f Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Wed Jan 29 10:42:30 UTC 2025
Experiment C96_atm3DVar_b4eee81f Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/3242/RUNTESTS/COMROOT/C96_atm3DVar_b4eee81f/logs/2021122100/gfs_arch_tars.log
Experiment C96C48_hybatmDA_b4eee81f Completed 3 Cycles: *SUCCESS* at Wed Jan 29 10:42:38 UTC 2025
Experiment C48_S2SW_b4eee81f Completed 1 Cycles: *SUCCESS* at Wed Jan 29 11:12:57 UTC 2025
Experiment C96C48_hybatmaerosnowDA_b4eee81f Completed 3 Cycles: *SUCCESS* at Wed Jan 29 11:13:14 UTC 2025
Experiment C48_S2SWA_gefs_b4eee81f Completed 1 Cycles: *SUCCESS* at Wed Jan 29 11:32:13 UTC 2025
Experiment C96C48_ufs_hybatmDA_b4eee81f Completed 3 Cycles: *SUCCESS* at Wed Jan 29 11:50:07 UTC 2025
Experiment C48mx500_3DVarAOWCDA_b4eee81f Completed 2 Cycles: *SUCCESS* at Wed Jan 29 12:14:48 UTC 2025

@DavidHuber-NOAA
Contributor

The gfs_arch_tars job failed twice for the C96_atm3DVar test on Hera because log files changed size while htar was running. The first failure involved the log file gfs_arch_vrfy.log and the second involved gfs_metpg2g1.log.

First, the parm/archive/gfsa.yaml.j2 file needs to be amended here

{% if not "gfs_arch.log" in log %}

to screen out gfs_arch_tars.log instead of gfs_arch.log.
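
As a rough Python equivalent of the template condition above, the screening could look like the sketch below. The function name `logs_to_archive` and its `exclude` parameter are hypothetical and only mirror the substring test used by the Jinja `in` check; the actual filtering lives in the template.

```python
def logs_to_archive(logs, exclude=("gfs_arch_tars.log",)):
    """Keep only log paths that contain none of the excluded names.

    Hypothetical helper: mirrors `{% if not "<name>" in log %}` by using
    a plain substring test on each log path.
    """
    return [log for log in logs if not any(name in log for name in exclude)]


# Example paths are illustrative; the file names come from this thread.
logs = [
    "logs/2021122100/gfs_arch_tars.log",
    "logs/2021122100/gfs_arch_vrfy.log",
    "logs/2021122100/gfs_metpg2g1.log",
]
kept = logs_to_archive(logs)
```

The log of the job that is currently writing the tarball is the one that has to be screened out, since its size keeps changing while htar reads it.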

Next, I think the dependencies need to be reworked for the gfs_arch_tars job so that it runs after gfs_arch_vrfy and the gfs_metp metatask. I could be convinced that the gfs_metp* tasks do not need to run before gfs_arch_tars, but then the screening in gfsa.yaml.j2 will need to also exclude gfs_metp*.log files.
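
One way to picture the proposed dependency rework is the sketch below. It is an assumption-laden illustration, not the workflow's code: the helper `arch_tars_dependencies` and the `do_metp` flag are hypothetical, the dict shape follows the `{'type': ..., 'name': ...}` form used elsewhere in this PR, and the `'metatask'` type value is assumed rather than confirmed here.

```python
def arch_tars_dependencies(run: str, do_metp: bool) -> list:
    """Build dependency descriptions for the <run>_arch_tars job.

    Hypothetical helper: returns plain dicts instead of calling the
    workflow's rocoto module, so it can run on its own.
    """
    # arch_tars should start only after the verification archive job ends,
    # so that gfs_arch_vrfy.log is no longer being written.
    deps = [{"type": "task", "name": f"{run}_arch_vrfy"}]

    # Optionally wait for the metp metatask so its logs are complete too.
    # ('metatask' as a dependency type is an assumption in this sketch.)
    if do_metp:
        deps.append({"type": "metatask", "name": f"{run}_metp"})

    return deps
```

If the metp tasks are not made a prerequisite, their logs would instead need to be excluded from the tarball, as noted above.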

@AntonMFernando-NOAA
Contributor Author

@WalterKolczynski-NOAA @DavidHuber-NOAA The C96_atm3DVar test passed with the new changes.
test: /scratch1/NCEPDEV/global/Anton.Fernando/RUNTESTS/EXPDIR/gefs_C96_atm3DVar

Comment on lines +1900 to +1902
if self.options['do_archtar']:
    dep_dict = {'type': 'task', 'name': f'{self.run}_arch_tars'}
    deps.append(rocoto.add_dependency(dep_dict))
Contributor

The arch_tars job shouldn't be a dependency for the metp jobs. I don't think you have to rerun your local test after you fix this.

Labels
CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split up the archive task
4 participants