Developers' Corner
Setting the top level of options. Users can preempt some settings here.
NOTE: variables that are needed only in the workflow setup step and not at runtime will be dropped from the final copy of exp.setup under EXPDIR.
2.1 The NCO standards define a set of variables. We should not change the meaning or usage of these variables, and we should not create new variables for anything the NCO standard variables already cover.
The descriptions below reflect our understanding of these variables. Please correct us if anything is wrong.
NET
: A reserved variable in the NCO implementation standards. It is rrfs for the RRFS system and rtma for the 3DRTMA system.
MESH_NAME
: The domain and grid configuration of a model, such as conus12km, conus3km, atl12km, etc. MESH_NAME is a keyword in rrfs-workflow; its value is fixed for an experiment, but there are a few choices when setting one up.
RUN
: The top-level subcomponent of a system. For GFS, for example, RUN can be gfs, gdas, enkfgdas, ocean, etc. For RRFS, we may have rrfs, enkfrrfs, firewx, etc. This is set automatically at runtime; users don't configure it.
PDY
: present day in YYYYMMDD format
cyc
: cycle time in GMT hours, formatted HH
DATAROOT
: Directory containing the working directory, typically $OPSROOT/tmp in production. On development machines, this is usually set to the stmp directory.
DATA
: Location of the job working directory, typically $DATAROOT/$jobid. In the development configuration, DATA can be pointed to other locations through util/sideload/launch.sh. In J-jobs and ex-scripts, however, we should always use this variable to access the current working directory of a task at runtime. In operations, DATA is removed immediately after a task completes; at the development stage, users can set KEEPDATA=YES to keep the DATA directories.
HOMErrfs, USHrrfs, EXECrrfs, PARMrrfs, SCRIPTSrrfs, FIXrrfs
: locations of the rrfs system directories
COMROOT
: root directory for input/output data for the rrfs system.
COMIN
: directory for the current model's input data, typically, $COMROOT/$NET/$model_ver/$RUN.$PDY
COMOUT
: directory for current model's output data, typically, $COMROOT/$NET/$model_ver/$RUN.$PDY
COMINgfs
: incoming data from GFS
COMINrap
: incoming data from RAP
COMINgefs
: incoming data from GEFS
KEEPDATA
: specifies whether or not the working directory should be kept upon the completion of a job.
EXPDIR
: The experiment directory, which contains rrfs.xml, exp.setup and a config/ directory
VERSION
: The system version number
MACHINE
: The machine name, such as jet, hera, orion, hercules, all in lower case
TAG
: Right now, it is only used as a prefix to slurm job names.
CDATE
: Current cycle in YYYYMMDDHH format
OBSINprepbufr
: the location of incoming prepbufr files
ENS_INDEX
: the ensemble index of a task. If this variable exists at runtime, the run is for one of the ensemble members.
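To make the relationships among these variables concrete, here is a minimal sketch of how a task might build COMOUT and DATA and honor KEEPDATA. Everything concrete in it is an assumption for illustration: the temporary directories standing in for COMROOT and DATAROOT, the model_ver and jobid values, and the out.txt product. The real values are set by the workflow at runtime.

```shell
# Sketch only: all values below are hypothetical stand-ins.
COMROOT=$(mktemp -d)          # stand-in for the real COMROOT
DATAROOT=$(mktemp -d)         # stand-in for $OPSROOT/tmp or an stmp directory
NET="rrfs"
RUN="rrfs"
model_ver="v1.0.0"            # hypothetical version string
CDATE="2024052700"
PDY=${CDATE:0:8}
cyc=${CDATE:8:2}
jobid="fcst_${cyc}.$$"        # hypothetical job identifier
KEEPDATA="NO"                 # set to YES to keep the working directory

# Compose the output and working directories as described above
COMOUT="${COMROOT}/${NET}/${model_ver}/${RUN}.${PDY}"
DATA="${DATAROOT}/${jobid}"
mkdir -p "${COMOUT}" "${DATA}"

# The task works inside DATA and delivers its product to COMOUT
cd "${DATA}"
echo "product" > out.txt
cp out.txt "${COMOUT}/"

# Remove the working directory unless the user asked to keep it
cd "${DATAROOT}"
if [ "${KEEPDATA}" != "YES" ]; then
  rm -rf "${DATA:?}"
fi
echo "COMOUT=${COMOUT}"
```

With KEEPDATA=YES the final block is skipped and DATA stays in place for inspection.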
If FCST_ONLY is exported before this line, it will take the predefined value; otherwise, it will take a default value of false.
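The line in question is not reproduced here, but the behavior described matches the usual shell default-value idiom. The sketch below assumes that idiom and shows both cases; default_case and preempted_case are names introduced only for this illustration.

```shell
# Case 1: nothing exported beforehand, so the default applies
unset FCST_ONLY
export FCST_ONLY="${FCST_ONLY:-false}"
default_case="${FCST_ONLY}"

# Case 2: the user preempts the setting before the line is reached
export FCST_ONLY="true"
export FCST_ONLY="${FCST_ONLY:-false}"
preempted_case="${FCST_ONLY}"

echo "default: ${default_case}, preempted: ${preempted_case}"
```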
CDATE="2024052700"
PDY=${CDATE:0:8} # eg: 20240527
cyc=${CDATE:8:2} # eg: 00
NDATE should be used to find the previous or next few cycles. This is more robust than writing a separate date script each time.
source ${HOMErrfs}/workflow/ush/load_prod_util.sh
${NDATE} 003 2024052700 # result: 2024052703
${NDATE} -003 2024052700 # result: 2024052621
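A common use is looping over several earlier cycles. The sketch below does this with the same call form as above. So that it also runs where prod_util is not loaded, it defines ndate_fallback, a hypothetical stand-in that is not part of the workflow and that assumes GNU date; when NDATE is already set, the stand-in is not used.

```shell
# Hypothetical stand-in for ${NDATE}; only used when NDATE is not set.
# Assumes GNU date. Supports the "<hours> <YYYYMMDDHH>" call form.
ndate_fallback() {
  local hours=$1 cdate=$2 sign=1 epoch
  case "${hours}" in
    -*) sign=-1; hours=${hours#-} ;;
    +*) hours=${hours#+} ;;
  esac
  epoch=$(date -u -d "${cdate:0:4}-${cdate:4:2}-${cdate:6:2} ${cdate:8:2}:00:00" +%s)
  # 10# forces base 10 so that zero-padded hours such as 009 are accepted
  date -u -d "@$(( epoch + sign * 10#${hours} * 3600 ))" +%Y%m%d%H
}
NDATE=${NDATE:-ndate_fallback}

# Collect the three cycles preceding CDATE at a 3-hour interval
CDATE="2024052700"
prev_cycles=()
for offset in 003 006 009; do
  prev_cycles+=("$(${NDATE} -${offset} ${CDATE})")
done
echo "${prev_cycles[*]}"
```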
For the differences between the operational and development setups, check this link
To accommodate those differences, the following measures are considered:
4.1. Side loading for non-NCO tasks, such as clean, archive, and graphics. They don't need J-jobs or ex-scripts and will be put under util/sideload.
4.2. The rocoto workflow management software does not provide some of the job card variables that ecflow does. To compensate for this, a workflow/sideload/launch.sh script is added to mimic the ecflow behavior and to provide a switch that routes a task to either a J-job or a non-NCO task.
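The routing switch can be pictured roughly as follows. This is not the content of launch.sh; it is a sketch of the idea only. The function name route_task, the jobs/JRRFS_<TASK> naming, the <task>.sh script names, the fcst task, and the HOMErrfs placeholder path are all assumptions made for illustration.

```shell
# Sketch of the routing idea; names and layout are hypothetical.
HOMErrfs="/path/to/rrfs-workflow"   # placeholder path

route_task() {
  local task=$1
  case "${task}" in
    clean|archive|graphics)
      # non-NCO task: run a side-loaded script directly
      echo "${HOMErrfs}/util/sideload/${task}.sh"
      ;;
    *)
      # NCO task: go through its J-job
      echo "${HOMErrfs}/jobs/JRRFS_$(echo "${task}" | tr '[:lower:]' '[:upper:]')"
      ;;
  esac
}

route_task clean
route_task fcst
```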
4.3. Use ${cpreq} to copy files/directories that a job requires to function. In many situations, soft links work better at the development stage, so the following line is added in workflow/sideload/launch.sh to adjust the cpreq command for development:
export cpreq="ln -snf"
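The benefit is that the calling line in a script stays the same in both modes. The sketch below runs the same ${cpreq} call twice in a temporary sandbox: once with the development setting above, and once with cp -p, which is only an assumed stand-in for the operational copy command. The file names are hypothetical.

```shell
# Sketch only: sandbox paths and file names are hypothetical.
sandbox=$(mktemp -d)
mkdir -p "${sandbox}/fix" "${sandbox}/DATA"
echo "&nml /" > "${sandbox}/fix/namelist.input"

# Development: cpreq is a soft link command, as exported in launch.sh
export cpreq="ln -snf"
${cpreq} "${sandbox}/fix/namelist.input" "${sandbox}/DATA/namelist.dev"

# Operations stand-in (assumption): a plain copy that preserves attributes
export cpreq="cp -p"
${cpreq} "${sandbox}/fix/namelist.input" "${sandbox}/DATA/namelist.ops"

ls -l "${sandbox}/DATA"
```

Note that ${cpreq} is left unquoted on purpose, so the shell splits it into the command and its options.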
4.4. Use links to manage fix files (more detailed information here). In the NCO implementation, a command such as the following
cp -rpL fix fix2; rm -rf fix; mv fix2 fix
will make a hard copy of the fix files needed for operation.
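The effect of the -L option can be checked in a small sandbox. The sketch below builds a fix directory containing one soft link (the file names are hypothetical), applies the same sequence, and counts the links before and after. The three commands are chained with && rather than ; so that a failed copy does not remove the original directory; that is a choice made for this sketch.

```shell
# Sketch only: sandbox layout and file names are hypothetical.
sandbox=$(mktemp -d)
mkdir -p "${sandbox}/source_data" "${sandbox}/fix"
echo "static field" > "${sandbox}/source_data/static.txt"
ln -snf "${sandbox}/source_data/static.txt" "${sandbox}/fix/static.txt"

links_before=$(find "${sandbox}/fix" -type l | wc -l)

# -L follows soft links, so fix2 receives real files instead of links
( cd "${sandbox}" && cp -rpL fix fix2 && rm -rf fix && mv fix2 fix )

links_after=$(find "${sandbox}/fix" -type l | wc -l)
echo "links before: ${links_before}, after: ${links_after}"
```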
4.5. To separate concerns and export only the environment variables a task requires at runtime, a cascade config structure will be adopted. Resource configurations (such as ACCOUNT, QUEUE, PARTITION, RESERVATION, NODES, WALLTIME, NATIVE, MEMORY, etc.) are needed only in the experiment setup process and will be separated from the runtime configuration. More detailed information here.
4.6. exp.setup or similar files under workflow/exp will be used to set up top-level variables for an experiment, such as directories, VERSION, TAG, the days for a realtime run, or the retro period for a retro run. Users can also preempt some environment variables here. These files make it quick to set up a development experiment (retro or realtime, different machines, and different grids/resolutions). They are not needed in operation.
4.7. The core of the workflow will only consider the NCO naming convention for all existing operational model products (such as gfs products, etc.). However, the workflow will provide example link utilities under workflow/ush that use hard or soft links to convert users' specific naming conventions to match the NCO standard.
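As a rough idea of what such a link utility could look like, the sketch below maps a user-side naming scheme onto operational-style GFS names with soft links. It is not one of the utilities under workflow/ush. The user-side pattern mygfs_<CDATE>_f<FFF>.grib2 is invented for illustration, the temporary directories stand in for real locations, and the target layout (gfs.<PDY>/<cyc>/atmos/gfs.t<cyc>z.pgrb2.0p25.f<FFF>) follows the operational GFS convention as we understand it and should be verified against the current products.

```shell
# Sketch only: the user-side naming scheme and all paths are hypothetical.
user_dir=$(mktemp -d)         # where a user keeps GFS files under a private naming scheme
COMINgfs=$(mktemp -d)         # stand-in for the real COMINgfs
CDATE="2024052700"
PDY=${CDATE:0:8}
cyc=${CDATE:8:2}

# A fake user file for forecast hour 3
touch "${user_dir}/mygfs_${CDATE}_f003.grib2"

target_dir="${COMINgfs}/gfs.${PDY}/${cyc}/atmos"
mkdir -p "${target_dir}"
for src in "${user_dir}"/mygfs_"${CDATE}"_f*.grib2; do
  fname=${src##*/}            # strip the directory part
  fhr=${fname##*_f}           # e.g. 003.grib2
  fhr=${fhr%.grib2}           # e.g. 003
  ln -snf "${src}" "${target_dir}/gfs.t${cyc}z.pgrb2.0p25.f${fhr}"
done

ls -l "${target_dir}"
```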