updates for documentation
mvertens committed Aug 11, 2020
1 parent 339112c commit 6a35dea
Showing 6 changed files with 873 additions and 7 deletions.
2 changes: 1 addition & 1 deletion doc/Makefile
@@ -4,7 +4,7 @@
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
-SPHINXPROJ = CMEPS
+SPHINXPROJ = CDEPS
SOURCEDIR = source
BUILDDIR = build

126 changes: 126 additions & 0 deletions doc/source/datm.rst
@@ -0,0 +1,126 @@
.. _datm-datamodes:

Data Atmosphere (DATM)
======================

DATM is normally used to provide observational forcing data (or
forcing data produced by a previous run using active components) to
drive prognostic components. The various ways of running DATM are
referred to as its modes.

In the case of CESM, these prognostic components would be: CTSM, POP2,
MOM6, POP2/CICE5-6 and MOM6/CICE5-6. As examples, CORE2_NYF (CORE2
normal year forcing) is the DATM mode used in driving POP2 and MOM6.
On the other hand, CLM_QIAN, CLMCRUNCEP, CLMGSWP3 and CLM1PT are DATM
modes using observational data for forcing CTSM.

.. _datm-datamodes:

--------------------
datamode values
--------------------

DATM has its own set of supported ``datamode`` values that appear in the
``datm_in`` namelist input. The datamode specifies what additional
operations need to be done by DATM on *ALL* of the streams in the
``datm.streams.xml`` file. Each datamode value is also associated
with a DATM source file that carries out these operations; the file is
listed in parentheses next to the mode name.

CLMNCEP (``datm_datamode_clmncep_mod.F90``)
- In conjunction with NCEP climatological atmosphere data, provides
the atmosphere forcing favored by the Land Model Working Group when
coupling an active land model with observed atmospheric
forcing. This mode replicates code previously found in CLM (circa
2005), before the LMWG started using the CIME coupling
infrastructure and data models to do active-land-only simulations.

CORE2_NYF (``datm_datamode_core2_mod.F90``)
- Coordinated Ocean-ice Reference Experiments (CORE) Version 2 Normal Year Forcing.

CORE2_IAF (``datm_datamode_core2_mod.F90``)
- In conjunction with CORE Version 2 atmospheric forcing data,
provides the atmosphere forcing when coupling an active ocean model
with observed atmospheric forcing. This mode and associated data
sets implement the CORE-IAF Version 2 forcing data, as developed by
Large and Yeager (2008) at NCAR. Note that CORE2_NYF and CORE2_IAF
work exactly the same way.

CORE_IAF_JRA (``datm_datamode_jra_mod.F90``)
- In conjunction with JRA-55 Project data, provides the atmosphere forcing
when coupling an active ocean model with observed atmospheric
forcing. This mode and associated data sets implement the JRA-55
v1.3 forcing data.

ERA5 (``datm_datamode_era5_mod.F90``)
- Fifth generation ECMWF atmospheric reanalysis of the global climate.

.. _datm-cime-vars:

---------------------------------------
Configuring DATM from CIME
---------------------------------------

If CDEPS is coupled to the CIME-CCS then the CIME ``$CASEROOT`` xml
variable ``DATM_MODE`` sets the collection of streams that are
associated with DATM and also sets the datm namelist variable
``datamode`` in the file ``datm_in``. The supported DATM ``datamode``
values are defined in the file ``namelist_definition_datm.xml``.

The following table describes the valid values of ``DATM_MODE``
(defined in the ``config_component.xml`` file for DATM), and how they
relate to the associated input streams and the ``datamode`` namelist
variable. CIME will generate a value of ``DATM_MODE`` based on the
compset.

CORE2_NYF
- CORE2 normal year forcing (CESM C and G compsets)
- streams: CORE2_NYF.GISS,CORE2_NYF.GXGXS,CORE2_NYF.NCEP
- datamode: CORE2_NYF

CORE2_IAF
- CORE2 interannual forcing (CESM C and G compsets)
- streams: CORE2_IAF.GCGCS.PREC,CORE2_IAF.GISS.LWDN,
CORE2_IAF.GISS.SWDN,CORE2_IAF.GISS.SWUP,
CORE2_IAF.NCEP.DN10,CORE2_IAF.NCEP.Q_10,
CORE2_IAF.NCEP.SLP_,CORE2_IAF.NCEP.T_10,CORE2_IAF.NCEP.U_10,
CORE2_IAF.NCEP.V_10,CORE2_IAF.CORE2.ArcFactor
- datamode: CORE2_IAF

CORE_IAF_JRA
- JRA-55 interannual forcing (CESM C and G compsets)
- streams: CORE_IAF_JRA.PREC,CORE_IAF_JRA.LWDN,CORE_IAF_JRA.SWDN,
CORE_IAF_JRA.Q_10,CORE_IAF_JRA.SLP_,CORE_IAF_JRA.T_10,CORE_IAF_JRA.U_10,
CORE_IAF_JRA.V_10,CORE_IAF_JRA.CORE2.ArcFactor
- datamode: CORE_IAF_JRA

CLM_QIAN_WISO
- QIAN atm input data with water isotopes (CESM I compsets)
- streams: CLM_QIAN_WISO.Solar,CLM_QIAN_WISO.Precip,CLM_QIAN_WISO.TPQW
- datamode: CLMNCEP

CLM_QIAN
- QIAN atm input data (CESM I compsets)
- streams: CLM_QIAN.Solar,CLM_QIAN.Precip,CLM_QIAN.TPQW
- datamode: CLMNCEP

CLMCRUNCEPv7
- CRUNCEP atm input data (CESM I compsets)
- streams: CLMCRUNCEP.Solar,CLMCRUNCEP.Precip,CLMCRUNCEP.TPQW
- datamode: CLMNCEP

CLMGSWP3
- GSWP3 atm input data (I compsets)
- streams: CLMGSWP3.Solar,CLMGSWP3.Precip,CLMGSWP3.TPQW
- datamode: CLMNCEP

CLM1PT
- single point tower site atm input data
- streams: CLM1PT.$ATM_GRID
- datamode: CLMNCEP

CPLHIST
- user-generated forcing data taken from coupler history files and
used to spin up relevant prognostic components (for CESM these are CLM, POP and CISM)
- streams: CPLHISTForcing.Solar,CPLHISTForcing.nonSolarFlux,
- datamode: CPLHIST
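
As an illustration only, the relationship between ``DATM_MODE`` and
``datamode`` listed above can be written as a lookup table. The Python
sketch below is not part of CDEPS or CIME: the dictionary and the two
helper functions are hypothetical names, and the entries are simply
copied from the list above.

.. code-block:: python

   # Hypothetical lookup table; entries copied from the DATM_MODE list above.
   DATM_MODE_TO_DATAMODE = {
       "CORE2_NYF": "CORE2_NYF",
       "CORE2_IAF": "CORE2_IAF",
       "CORE_IAF_JRA": "CORE_IAF_JRA",
       "CLM_QIAN_WISO": "CLMNCEP",
       "CLM_QIAN": "CLMNCEP",
       "CLMCRUNCEPv7": "CLMNCEP",
       "CLMGSWP3": "CLMNCEP",
       "CLM1PT": "CLMNCEP",
       "CPLHIST": "CPLHIST",
   }


   def datamode_for(datm_mode):
       """Return the datamode associated with a DATM_MODE value."""
       try:
           return DATM_MODE_TO_DATAMODE[datm_mode]
       except KeyError:
           raise ValueError(f"unsupported DATM_MODE: {datm_mode!r}") from None


   def modes_sharing(datamode):
       """Return all DATM_MODE values that map to the given datamode."""
       return sorted(m for m, d in DATM_MODE_TO_DATAMODE.items() if d == datamode)

The table makes one point of the list explicit: all of the CLM forcing
modes differ only in their streams and share the single ``CLMNCEP``
datamode.
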
210 changes: 210 additions & 0 deletions doc/source/design_details.rst
@@ -0,0 +1,210 @@
.. _design-details:

================
Design Details
================

----------------------
Data Model Performance
----------------------

There are two primary costs associated with CDEPS share code: reading data and spatially mapping data.
Time interpolation is relatively cheap in the current implementation.
As much as possible, redundant operations are minimized.
The mapped input data at the upper and lower time bounds are saved between time steps, which reduces mapping costs in cases where data is time interpolated more often than new data is read.
If the input data timestep is relatively small (for example, hourly data as opposed to daily or monthly data) the cost of reading input data can be quite large.
Also, the cost of the data model can vary significantly over the course of the run, for instance when new input data must be read and interpolated, although the variation is relatively predictable.
The present implementation doesn't support changing the order of operations, for instance, time interpolating the data before spatial mapping.
Because the present computations are always linear, changing the order of operations will not fundamentally change the results.
The present order of operations generally minimizes the mapping cost for typical data model use cases.
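
The claim that the order of operations does not change the results can
be checked with a small sketch. The Python below is illustrative only
and does not use any CDEPS code: the weight matrix, the field values
and the function names are all made up, with the spatial mapping
represented as a plain matrix of weights and the time interpolation as
a linear blend of two bounds.

.. code-block:: python

   def map_field(weights, field):
       """Apply a linear spatial mapping: one output value per row of weights."""
       return [sum(w * f for w, f in zip(row, field)) for row in weights]


   def time_interp(lower, upper, frac_upper):
       """Linearly blend the lower- and upper-bound fields."""
       return [(1.0 - frac_upper) * lo + frac_upper * up
               for lo, up in zip(lower, upper)]


   # Hypothetical 3-point source field mapped to 2 destination points.
   weights = [[0.5, 0.5, 0.0],
              [0.0, 0.25, 0.75]]
   lower = [10.0, 20.0, 30.0]
   upper = [14.0, 28.0, 26.0]

   # Map each bound once when it is read, then reuse the mapped bounds.
   mapped_lower = map_field(weights, lower)
   mapped_upper = map_field(weights, upper)

   for frac in (0.0, 0.25, 0.5, 1.0):
       map_then_interp = time_interp(mapped_lower, mapped_upper, frac)
       interp_then_map = map_field(weights, time_interp(lower, upper, frac))
       # Both orders agree to rounding error because both steps are linear.
       assert all(abs(a - b) < 1e-12
                  for a, b in zip(map_then_interp, interp_then_map))

In this sketch the mapping is applied twice in total, once per bound,
however many interpolation steps follow. Interpolating first would
require one mapping per step.
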

----------------------
IO Through Data Models
----------------------

At the present time, data models can only read netcdf data, and IO is handled through the PIO library using either netcdf or pnetcdf.
PIO can read the data either serially or in parallel in chunks that are approximately the global field size divided by the number of IO tasks.
If pnetcdf is used through PIO, then the pnetcdf library must be included during the build of the model.

----------------------------------
IO Through Data Models In CIME-CCS
----------------------------------

If CDEPS is used in CIME, the pnetcdf path and option are hardwired
into the ``Macros.make`` file for the specific machine. To turn on
``pnetcdf`` in the build, make sure the ``Macros.make`` variables
``PNETCDF_PATH``, ``INC_PNETCDF``, and ``LIB_PNETCDF`` are set and
that the PIO ``CONFIG_ARGS`` sets the ``PNETCDF_PATH`` argument.
Beyond just the option of selecting IO with PIO, several namelist variables are available to help optimize PIO IO performance.
Those are **TODO** - list these.
The total number of MPI tasks that can be used for IO is limited to the total number of tasks used by the data model.
Often though, using fewer IO tasks results in improved performance.
In general, ``io_root + (num_iotasks-1)*io_stride + 1`` has to be less than the total number of data model tasks.
In practice, PIO seems to perform best somewhere between the extremes of 1 task and all tasks, and the optimal setting is highly machine and problem dependent.
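
The constraint on the IO task layout can be sketched as follows. This
is an illustrative Python helper, not PIO or CDEPS code: the function
names are hypothetical, ``io_root`` is assumed to be a zero-based task
rank, and the inequality is taken literally from the text above,
including the strict "less than".

.. code-block:: python

   def io_task_ranks(io_root, num_iotasks, io_stride):
       """List the task ranks that would act as IO tasks."""
       return [io_root + i * io_stride for i in range(num_iotasks)]


   def satisfies_documented_bound(io_root, num_iotasks, io_stride, ntasks):
       """Check the inequality exactly as quoted in the text (strict '<')."""
       return io_root + (num_iotasks - 1) * io_stride + 1 < ntasks


   # Example: 16 data model tasks, 4 IO tasks starting at rank 1.
   print(io_task_ranks(1, 4, 4))                   # ranks 1, 5, 9, 13
   print(satisfies_documented_bound(1, 4, 4, 16))  # 14 < 16
   print(satisfies_documented_bound(1, 4, 5, 16))  # 17 < 16 fails

A check like this, run before the model starts, can catch a stride
that pushes the last IO task beyond the tasks available to the data
model.
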

-------------
Restart Files
-------------
Restart files are generated automatically by the data models based on an attribute flag received in the NUOPC cap.
The restart files must meet the CIME-CCS naming convention and an ``rpointer`` file is generated at the same time.
An ``rpointer`` file is a *restart pointer* file which contains the name of the most recently created restart file.
Normally, if restart files are read, the restart filenames are specified in the ``rpointer`` file.
Optionally though, there are namelist variables such as ``restfilm`` to specify the restart filenames via namelist. If those namelist variables are set, the ``rpointer`` file will be ignored.
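
The precedence between the namelist setting and the ``rpointer`` file
can be sketched as below. This is a hypothetical Python helper, not
the CDEPS implementation. It assumes that the ``rpointer`` file is
plain text with the restart filename on its first non-blank line, and
that an unset namelist value is represented by ``None`` or an empty
string.

.. code-block:: python

   from pathlib import Path


   def resolve_restart_file(rpointer_path, namelist_restart_file=None):
       """Return the restart filename to read.

       namelist_restart_file stands in for a namelist setting such as
       restfilm. When it is set, the rpointer file is ignored.
       """
       if namelist_restart_file:
           return namelist_restart_file
       # Assumption: the restart filename is the first non-blank line.
       for line in Path(rpointer_path).read_text().splitlines():
           if line.strip():
               return line.strip()
       raise ValueError(f"no restart filename found in {rpointer_path}")

Calling ``resolve_restart_file(path)`` returns the name stored in the
pointer file, while ``resolve_restart_file(path, "other.nc")`` returns
``"other.nc"`` without opening the pointer file at all.
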

In most cases, no restart file is required for the data models to restart exactly.
This is because there is no memory between timesteps in many of the data model science modes.
If a restart file is required, it will be written automatically and then must be used to continue the previous run.

There are separate stream restart files that only exist for
performance reasons. A stream restart file contains information about
the time axis of the input streams. This information helps reduce the
startup costs associated with reading the input dataset time axis
information. If a stream restart file is missing, the code will
restart without it but may need to reread data from the input data
files that would have been stored in the stream restart file. This
will take extra time but will not impact the results.

.. _data-structures:

---------------
Stream Modules
---------------

The CDEPS stream code contains four modules:

**dshr_strdata_mod.F90**
Carries out stream IO along with the spatial and
temporal interpolation of the stream data to the model mesh and
model time. Initializes the module data type ``shr_strdata_type``.

**dshr_stream_mod.F90**
Reads in the stream xml file and returns the upper and
lower bounds of the stream data. Initializes the module data type
``shr_stream_streamType``.

**dshr_tinterp_mod.F90**
Determines the time interpolation factors.

**dshr_methods_mod.F90**
Wrappers to ESMF such as getting a pointer to a field in a field bundle, etc.
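
To make the idea of time interpolation factors concrete, the sketch
below computes the two weights for the ``'linear'`` case. It is a
minimal Python illustration with a hypothetical function name, not the
code in ``dshr_tinterp_mod.F90``. It uses Python's Gregorian
``datetime`` and therefore ignores the stream calendar handling and the
other interpolation algorithms.

.. code-block:: python

   from datetime import datetime


   def linear_time_factors(t_lower, t_upper, t_model):
       """Return (f_lower, f_upper) weights for linear interpolation in time."""
       if not t_lower <= t_model <= t_upper:
           raise ValueError("model time must lie between the stream bounds")
       span = (t_upper - t_lower).total_seconds()
       if span == 0.0:
           # Bounds coincide: use the lower-bound data unchanged.
           return 1.0, 0.0
       f_upper = (t_model - t_lower).total_seconds() / span
       return 1.0 - f_upper, f_upper


   # Six-hourly stream data, model time 1.5 hours after the lower bound.
   f_lower, f_upper = linear_time_factors(
       datetime(2000, 1, 1, 0), datetime(2000, 1, 1, 6), datetime(2000, 1, 1, 1, 30)
   )
   print(f_lower, f_upper)

The interpolated field is then ``f_lower * data_lower + f_upper *
data_upper``, applied to the stream data at the lower and upper bounds.
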

----------------
Stream Datatypes
----------------

The most basic type, ``shr_stream_fileType``, is contained in
``dshr_stream_mod.F90`` and specifies basic information related to a
given stream file.

.. code-block:: Fortran

   type shr_stream_fileType
      character(SHR_KIND_CL) :: name = shr_stream_file_null ! the file name
      logical :: haveData = .false. ! has t-coord data been read in?
      integer (SHR_KIND_IN) :: nt = 0 ! size of time dimension
      integer (SHR_KIND_IN),allocatable :: date(:) ! t-coord date: yyyymmdd
      integer (SHR_KIND_IN),allocatable :: secs(:) ! t-coord secs: elapsed on date
   end type shr_stream_fileType

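
The ``date`` and ``secs`` arrays together encode the time coordinate
of a stream file. As a rough illustration of that encoding, the Python
sketch below turns one ``yyyymmdd`` integer and its seconds elapsed on
that date into a timestamp. The function name is hypothetical and the
sketch assumes a Gregorian calendar, so it does not reproduce the
no-leap calendar handling of the stream code.

.. code-block:: python

   from datetime import datetime, timedelta


   def decode_time_coord(date, secs):
       """Convert a yyyymmdd integer and seconds elapsed on that date."""
       year, rem = divmod(date, 10000)
       month, day = divmod(rem, 100)
       return datetime(year, month, day) + timedelta(seconds=secs)


   # 15 January 2000, 21600 seconds (6 hours) into the day.
   print(decode_time_coord(20000115, 21600))
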

The following type, ``shr_stream_streamType``, encapsulates the
information related to all files specific to a target stream (see the
overview of the :ref:`stream_description_file`).

.. code-block:: Fortran

   type shr_stream_streamType
      !private ! no public access to internal components
      integer :: logunit ! stdout log unit
      type(iosystem_desc_t), pointer :: pio_subsystem
      integer :: pio_iotype
      integer :: pio_ioformat
      logical :: init = .false. ! has stream been initialized
      integer :: nFiles = 0 ! number of data files
      integer :: yearFirst = -1 ! first year to use in t-axis (yyyymmdd)
      integer :: yearLast = -1 ! last year to use in t-axis (yyyymmdd)
      integer :: yearAlign = -1 ! align yearFirst with this model year
      character(CS) :: taxMode = shr_stream_taxis_cycle ! cycling option for time axis
      character(CS) :: tInterpAlgo = 'linear' ! algorithm to use for time interpolation
      character(CS) :: mapalgo = 'bilinear' ! type of mapping - default is 'bilinear'
      character(CS) :: readMode = 'single' ! stream read mode - 'single' or 'full_file'
      real(r8) :: dtlimit = 1.5_r8 ! delta time ratio limits for time interpolation
      integer :: offset = 0 ! offset in seconds of stream data
      character(CS) :: calendar = shr_cal_noleap ! stream calendar (obtained from first stream data file)
      character(CL) :: meshFile = ' ' ! filename for mesh for all fields on stream (full pathname)
      integer :: k_lvd = -1 ! file/sample of least valid date
      integer :: n_lvd = -1 ! file/sample of least valid date
      logical :: found_lvd = .false. ! T <=> k_lvd,n_lvd have been set
      integer :: k_gvd = -1 ! file/sample of greatest valid date
      integer :: n_gvd = -1 ! file/sample of greatest valid date
      logical :: found_gvd = .false. ! T <=> k_gvd,n_gvd have been set
      logical :: fileopen = .false. ! is current file open
      character(CL) :: currfile = ' ' ! current filename
      integer :: nvars ! number of stream variables
      character(CL) :: stream_vectors ! stream vectors names
      type(file_desc_t) :: currpioid ! current pio file desc
      type(shr_stream_file_type) , allocatable :: file(:) ! filenames of stream data files (full pathname)
      type(shr_stream_data_variable), allocatable :: varlist(:) ! stream variable names (on file and in model)
   end type shr_stream_streamType


Finally, the datatypes ``shr_strdata_perstream`` and
``shr_strdata_type`` in ``dshr_strdata_mod.F90`` are at the heart
of the CDEPS stream code and contain information for
all the streams that are active for the target data model.

.. code-block:: Fortran

   type shr_strdata_perstream
      character(CL) :: stream_meshfile ! stream mesh file from stream txt file
      type(ESMF_Mesh) :: stream_mesh ! stream mesh created from stream mesh file
      type(io_desc_t) :: stream_pio_iodesc ! stream pio descriptor
      logical :: stream_pio_iodesc_set =.false. ! true=>pio iodesc has been set
      type(ESMF_RouteHandle) :: routehandle ! stream n -> model mesh mapping
      character(CL), allocatable :: fldlist_stream(:) ! names of stream file fields
      character(CL), allocatable :: fldlist_model(:) ! names of stream model fields
      integer :: stream_lb ! index of the Lowerbound (LB) in fldlist_stream
      integer :: stream_ub ! index of the Upperbound (UB) in fldlist_stream
      type(ESMF_Field) :: field_stream ! a field on the stream data domain
      type(ESMF_Field) :: stream_vector ! a vector field on the stream data domain
      type(ESMF_FieldBundle), allocatable :: fldbun_data(:) ! stream field bundle interpolated to model grid
      type(ESMF_FieldBundle) :: fldbun_model ! stream n field bundle interpolated to model grid and time
      integer :: ucomp = -1 ! index of vector u in stream
      integer :: vcomp = -1 ! index of vector v in stream
      integer :: ymdLB = -1 ! stream ymd lower bound
      integer :: todLB = -1 ! stream tod lower bound
      integer :: ymdUB = -1 ! stream ymd upper bound
      integer :: todUB = -1 ! stream tod upper bound
      real(r8) :: dtmin = 1.0e30_r8
      real(r8) :: dtmax = 0.0_r8
      type(ESMF_Field) :: field_coszen ! needed for coszen time interp
   end type shr_strdata_perstream


.. code-block:: Fortran

   type shr_strdata_type
      type(shr_strdata_perstream), allocatable :: pstrm(:) ! stream info
      type(shr_stream_streamType), pointer :: stream(:)=> null() ! stream datatype
      integer :: nvectors ! number of vectors
      logical :: masterproc
      integer :: logunit ! stdout unit
      integer :: io_type ! pio info
      integer :: io_format ! pio info
      integer :: modeldt = 0 ! model dt in seconds
      type(ESMF_Mesh) :: model_mesh ! model mesh
      real(r8), pointer :: model_lon(:) => null() ! model longitudes
      real(r8), pointer :: model_lat(:) => null() ! model latitudes
      integer :: model_nxg ! model global domain lon size
      integer :: model_nyg ! model global domain lat size
      integer :: model_nzg ! model global domain vertical size
      integer :: model_lsize ! model local domain size
      integer, pointer :: model_gindex(:) ! model global index space
      integer :: model_gsize ! model global domain size
      type(ESMF_Clock) :: model_clock ! model clock
      character(CL) :: model_calendar = shr_cal_noleap ! model calendar for ymd,tod
      integer :: ymd, tod ! model time
      type(iosystem_desc_t), pointer :: pio_subsystem => null() ! pio info
      real(r8) :: eccen = SHR_ORB_UNDEF_REAL ! cosz t-interp info
      real(r8) :: mvelpp = SHR_ORB_UNDEF_REAL ! cosz t-interp info
      real(r8) :: lambm0 = SHR_ORB_UNDEF_REAL ! cosz t-interp info
      real(r8) :: obliqr = SHR_ORB_UNDEF_REAL ! cosz t-interp info
      real(r8), allocatable :: tavCoszen(:) ! cosz t-interp data
   end type shr_strdata_type

16 changes: 11 additions & 5 deletions doc/source/index.rst
@@ -5,11 +5,14 @@
CDEPS documentation
===================
-The Community Data Models for Earth Prediction Systems (CMEPS) is a
-NUOPC-compliant Mediator component used for coupling Earth system
-model components. It is currently being used in NCAR's Community
-Earth System Model (CESM) and NOAA's subseasonal-to-seasonal
-coupled system.
+
+The Community Data Models for Earth Predictive Systems (CDEPS)
+contains a set of NUOPC-compliant data components along with
+ESMF-based share code that enables new capabilities in selectively
+removing feedbacks in coupled model systems. The CDEPS data
+models perform the basic function of reading external data files,
+modifying those data, and then sending the data back to the CMEPS
+mediator.

Table of contents
-----------------
@@ -18,3 +21,6 @@ Table of contents
:numbered:

introduction.rst
streams.rst
design_details.rst
datm.rst