REDOC II.2 - PRISM coupler
06/05/2002 - Version 1.0
S. Valcke, CERFACS
Summary
After introducing some general coupling concepts, the following document
gives a list of possible requirements and design options for the PRISM
coupler. Clearly, not all of these requirements will be fulfilled, nor all
of these options implemented, in the future PRISM coupler. This exhaustive
list of possible requirements and options will help the coupler developers
to identify the relevant functionalities for the future PRISM coupler and
to establish a list of priorities for the next 3 years of development. A
review of existing couplers and coupling applications is then presented
in the last section.
Outline
Part I. Introduction - Definitions
1. Static, dynamic, or interactive coupled simulation
2. Possible coupling relations between two components
Part II. Possible requirements and design options for
the PRISM coupler
1. General requirements
2. Driver requirements and design options
3. Transformer requirements and design options
3.1 List of possible transformations and other requirements
3.2 Design options for the Transformer location
3.3 Design options for the Transformer parallelisation
4. PRISM System Model Interface Library & Data Exchange
library requirements and design options
4.1 PSMILe general requirements
4.2 Data Exchange Library requirements
4.3 An important design option: a common Model Interface Library for I/O and coupling data
Part III. Coupler review
1. OASIS
2. Palm
3. MpCCI
4. Calcium
5. CCSM Coupler 6
6. Distributed Data Broker
7. Flexible Modeling System
8. Coumehy
Annexe I - Technical advantages and disadvantages of
a dynamic Driver
Part I. Introduction - Definitions
This section introduces some general coupling concepts. The nature of
a coupled simulation is first discussed; it can be static, dynamic, or
interactive. The possible coupling relations between two component models
are then analysed.
1. Static, dynamic, or interactive coupled simulation
An important concept relates to the possibility given, or not, to the
coupling parameters to evolve during the coupled simulation. The coupling
parameters include:
- the component models
- the characteristics of the coupling exchanges (fields, frequencies, transformations, etc.)
- the characteristics of the coupling fields themselves (units, grid, partitioning, etc.).
Different options can be defined: a coupled simulation can be static,
dynamic, or interactive (with respect to the process management, with
respect to the coupling exchange characteristics, or with respect to the
coupling field characteristics).
In a static coupled simulation, all coupling parameters are fixed initially
and do not change during the whole simulation. All information given by the
models (coupling field units, grid, partitioning, etc.) or prescribed by
the user (components, coupling fields, coupling frequencies, etc.) is
defined only once initially. The component model processes and their
corresponding rank and location (processor, node) are fixed from the
beginning to the end of the simulation.
A coupled simulation can be dynamic:
- with respect to the process management (PM dynamic):
Component models and/or additional coupling processes can be launched
during the simulation.
An example is a coupled simulation that starts with a reduced number of
component models, these components first reaching some kind of equilibrium
before other components are started. One could also think of a regional
coupled simulation starting only with atmosphere and ocean component
models, in which a sea ice model is activated only if the sea surface
reaches freezing conditions.
- with respect to the coupling exchange characteristics (CE dynamic):
The characteristics of the coupling exchanges (fields, frequencies,
transformations, etc.) are allowed to change during the simulation.
One example is a coupled simulation in which a particular coupling field
is exchanged only if a particular scientific condition is met.
- with respect to the coupling field characteristics (CF dynamic):
The characteristics of the coupling fields themselves (units, grid,
partitioning, etc.) are allowed to change during the simulation. Transfer
of information between the model and the rest of the coupled system must
be possible at any point during the simulation.
One example is a coupled simulation in which one or more components are
subject to dynamic grid refinement.
In an interactive coupled simulation, the user can modify the coupling
parameters at run-time: an interactive coupling is necessarily dynamic.
All implications of a dynamic coupling are also valid for an interactive
coupling, with the added implication that the information prescribed by
the user must be transferable to the coupled system at run-time.
As for a dynamic coupled simulation, a simulation can be interactive:
- with respect to the process management (PM interactive)
- with respect to the coupling exchange characteristics (CE interactive)
- with respect to the coupling field characteristics (CF interactive)
2. Possible coupling relations between two components
The coupler will co-ordinate the execution of several major climate
component models. It is firstly important to analyse the coupling relations
that can exist between any two of these components.
Two components can be sequential by nature or concurrent by
nature. It is also possible to force two components sequential by nature
to run concurrently, or two components concurrent by nature to run sequentially;
we will refer to two components having one of these relations respectively
as concurrent by construction, or sequential by construction.
- sequential by nature: two models are sequential by nature if the first
model necessarily waits while the second model is running, and vice-versa.
The sequence is imposed by the exchange of coupling fields.
- concurrent by nature: two models are concurrent by nature if coupling
data produced by one model depend only on coupling data produced previously
by the other model, not on data produced during the same timestep; both
models can therefore run at the same time.
- concurrent by construction: two models are concurrent by construction
if they are sequential by nature but forced to run concurrently. This
requires, at a given timestep, that coupling data produced at the preceding
timestep are used as input.
- sequential by construction: two models are sequential by construction
if they are concurrent by nature but forced to run sequentially. This
requires, at a given timestep, that coupling data produced at the preceding
timestep are used as input.
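The relations above can be illustrated with a toy timestep loop. The following Python sketch (all field names, coefficients, and initial values are invented for illustration) contrasts two components that are sequential by nature with the same pair forced concurrent by construction, where each component uses the coupling field produced at the preceding timestep:

```python
# Toy atmosphere/ocean pair exchanging one scalar field each way per
# timestep. Coefficients and initial values are arbitrary.

def run_sequential(n_steps):
    """Sequential by nature: the ocean waits for this timestep's flux."""
    sst, flux = 10.0, 0.0
    for _ in range(n_steps):
        flux = 0.1 * sst          # atmosphere uses the SST of the SAME timestep
        sst = sst + 0.5 * flux    # ocean uses the flux of the SAME timestep
    return sst

def run_concurrent(n_steps):
    """Concurrent by construction: inputs lag by one timestep."""
    sst, flux = 10.0, 0.0
    for _ in range(n_steps):
        new_flux = 0.1 * sst          # atmosphere uses the PRECEDING SST
        new_sst = sst + 0.5 * flux    # ocean uses the PRECEDING flux
        flux, sst = new_flux, new_sst # both steps could run simultaneously
    return sst
```

The concurrent version allows both components to run at the same time, at the cost of a one-timestep lag in the coupling data (the concurrent result at step n equals the sequential result at step n-1 in this toy case).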
Part II. Possible requirements and design options for
the PRISM coupler
The main constituents of the coupler are: the Driver, the Transformer,
and the PRISM System Model Interface Library (PSMILe) which interfaces
the model with the rest of the coupled system, and therefore includes the
Data Exchange library (DEL).
1. General requirements
- The overhead associated with the global system modularity and flexibility
is acceptable.
- The whole system is portable and efficient on the different hardware
architectures used for climate modelling, on dedicated or shared hardware
resources. Standard and portable solutions should be preferred. However,
for critical issues for which a portable solution does not exist or would
lead to very low efficiency, machine-dependent options could be offered.
- The design and implementation lead to code that is easy to maintain and
can be easily modified to support future model or coupling functionalities.
- The design reflects a clear separation of responsibilities for the
different parts of the coupler.
- The PRISM System infrastructure can be used to technically assemble a
coupled system based on any component models, even if these models do not
conform to the PRISM physical interfaces, provided that they include the
well-defined PRISM System Model Interface Library.
- The PRISM System infrastructure can be used to couple an arbitrary number
of component models; any component can be one-way or two-way coupled with
any other component.
2. Driver requirements and design options
The Driver manages the whole coupled application. It may launch the
component models, monitor their execution and termination, orchestrate
the exchanges of coupling data, centralise and distribute simulation parameters
which require a consistent definition among all component models, and centralize
and distribute information on the component model status during the simulation.
A design option is to decentralize the coupling
functionalities as much as possible in the Data Exchange library included
in the different model interface libraries and in the Transformer, and
therefore to reduce as much as possible the role of the Driver. This option
is probably applicable only for static coupled simulations and allows
an easier evolution toward heterogeneous coupling (different component
models running on different machines).
Model execution and control:
- The Driver can control model execution concurrently, in a regular
sequence (one after the other), or in some pre-defined combination of these
two modes.
- The Driver manages static simulations; this is the minimal option and
the simplest one to implement.
- The Driver can control dynamic model execution (one or more models may
start and end at pre-determined points in the simulation). As presented
in Part I 1., this functionality will be required for scientific reasons
if it is decided that the PRISM System should support dynamic coupled
simulations. A discussion of the technical advantages and disadvantages
of a dynamic Driver (with respect to the process management) is presented
in Annexe I.
- The Driver can control conditional model execution (one model is started
during the simulation only if a particular condition is met).
- The Driver can control interactive coupled simulations.
- The Driver can control a global coupled system that is flexible in terms
of executables (the extremes are: each component is a separate executable
-MPMD-, or all components run in parallel or in sequence within only one
executable -SPMD-).
- The Driver can take advantage of extra hardware resources as they become
available, within a static envelope.
- The Driver can give some statistics on the load balancing of the run.
- The Driver includes a timer which allows it to measure the duration of
events with an identical absolute time reference for all component models.
- The Driver warns the user who tries to assemble an invalid combination
of component models.
- The Driver warns the user if the coupling and I/O frequencies are not
synchronized.
Information management:
- The Driver centralizes the universal parameters, i.e. the parameters that
need to be consistently defined in the coupled system (initial date, length
of integration, calendar, earth radius, solar constant, restart saving
frequency, etc.). This information can be defined by the user or by one
master model (the atmosphere). The Driver transfers this information to
all component models.
(In a decentralized approach, the universal parameters would be read in
directly by each model PSMILe.)
- The Driver centralizes all model information (grid definition,
distribution, etc.). For a CF dynamic simulation, this information may
evolve at run-time. The Driver transfers this information to the rest of
the coupled system, when and where required.
(In a decentralized approach, each model PSMILe would be responsible for
transferring the appropriate information to the appropriate processes.)
- The Driver centralizes information on the state of all component models
in the simulation and transfers this information to a higher-level
controlling layer or to the user.
(In a decentralized approach, each model PSMILe would be responsible for
transferring its status information to a higher-level controlling layer
or to the user.)
Coupling exchange management:
Termination and restart:
- The Driver ensures that the whole simulation shuts down cleanly (regular
and unforeseen termination) in an intelligent way (e.g. after the restart
is saved) and reports an error if one component aborts.
- The Driver constantly updates, by writing in a restart log file or by
any other equivalent means, the last date for which all model restarts were
saved. In case the coupled system is re-started after an unforeseen
termination (machine breakdown, ...), the Driver automatically finds in
the restart log file the appropriate date of restart; it restarts the
component models and transfers this restart date to all of them.
- The Driver is able to shut down the simulation cleanly if a specific
scientific condition is met (e.g. average SST exceeds some predefined
value).
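The restart log file idea above can be sketched in a few lines; the file layout (one date string per line, the last line being the most recent date for which all restarts are on disk) is an assumption for illustration, not a PRISM specification:

```python
# Sketch of a Driver restart log: append the last safely-restartable date
# after every restart save, and read back the latest one on recovery.
import os

def record_restart_date(logfile, date):
    """Append the last date for which all model restarts were saved."""
    with open(logfile, "a") as f:
        f.write(date + "\n")

def last_restart_date(logfile):
    """On recovery, return the appropriate restart date, or None if absent."""
    if not os.path.exists(logfile):
        return None
    with open(logfile) as f:
        dates = [line.strip() for line in f if line.strip()]
    return dates[-1] if dates else None
```

Appending rather than overwriting keeps the full history, so a corrupted last line could in principle be recovered from the previous entry.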
3. Transformer requirements and design options
The Transformer performs on the coupling data all transformations required
between two component models.
3.1 List of possible transformations and other requirements
The Transformer may provide the following transformations:
- Time operations:
  - time averaging or sum
  - time interpolation
  - minimum or maximum over a certain time range
  - time variance
  - other time operations
- 2D spatial interpolation:
  - nearest-neighbour (Ex: NNEIBOR)
  - nearest-neighbour Gaussian weighted (Ex: GAUSSIAN)
  - bilinear (Ex: BILINEAR)
  - bicubic (Ex: BICUBIC)
  - 1st order conservative remapping (Ex: SURFMESH)
  - 2nd order conservative remapping
  - higher order conservative remapping
  - remapping using user-defined remapping info (e.g. runoff remapping) (Ex: MOZAIC)
  - other
- 3D spatial interpolation:
  - nearest-neighbour
  - nearest-neighbour Gaussian weighted
  - bilinear
  - bicubic
  - 1st order conservative remapping
  - 2nd order conservative remapping
  - higher order conservative remapping
  - remapping using user-defined remapping info
  - other
- 1D spatial interpolation:
  - nearest-neighbour
  - nearest-neighbour Gaussian weighted
  - bilinear
  - bicubic
  - 1st order conservative remapping
  - 2nd order conservative remapping
  - higher order conservative remapping
  - remapping using user-defined remapping info
  - other
- Other transformations:
  - Conservation: ensure global energy conservation between source and target grid (Ex: CONSERV)
  - Combination of different parts of different coupling fields or of other predefined external data (Ex: FILLING)
  - Algebraic operations, with possibly different coupling fields or predefined external data and numbers as operands (+, -, X, SQRT, ^2, ...) (Ex: BLASOLD, BLASNEW, SUBGRID, CORRECT)
  - Spatial maximum or minimum, possibly relative to a threshold (MAX(var,0): maximum of positive values)
  - Specific algebraic transformations:
    - Celsius <-> Kelvin
    - Degree <-> Radian
  - Indexing operations:
    - Mask: only the points listed in an index have meaningful data; the others are changed to missing (Ex: MASK)
    - Scatter: scatters the model data onto the points listed in an index
    - Gather: gathers from the input data all the points listed in an index
  - Spatial "collapse" operations: collapse of any dimension or combination of dimensions by various, possibly weighted, statistical operations (mean, max, min, etc.)
  - Subspace: extraction of subspaces or hyperslabs in any combination of spatio-temporal or other dimensions
  - Others
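Once the remapping info is available, several of the remappings listed above (e.g. SURFMESH- or MOZAIC-style) reduce to applying precomputed addresses and weights to the source field. A minimal Python sketch; the link lists are invented toy values, not the output of any real remapping tool:

```python
# Apply precomputed remapping info: for each target point, a list of
# (source_index, weight) links. For 1st order conservative remapping the
# weights would be fractional overlap areas; here they are toy values.

def remap(src, links):
    """links[j] = list of (source index, weight) pairs for target point j."""
    return [sum(w * src[i] for i, w in pairs) for pairs in links]

# Toy 4-point source grid remapped onto a 2-point target grid.
links = [
    [(0, 0.5), (1, 0.5)],        # target 0: mean of source points 0 and 1
    [(2, 0.25), (3, 0.75)],      # target 1: weighted blend of source 2 and 3
]
print(remap([1.0, 3.0, 4.0, 8.0], links))   # -> [2.0, 7.0]
```

A CONSERV-type step would additionally rescale the target field so that the global (area-weighted) integral of the source field is preserved.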
The Transformer may support the following grid types:
- horizontally:
  - Cartesian, i.e. the location of each grid point is given as a 2D array (i, j)
  - regular, irregular, or stretched in longitude and in latitude
  - regular in longitude for each parallel, but unstructured in latitude ("reduced" atmospheric grid)
  - unstructured in longitude and in latitude
  - staggered
- vertically:
  - V1 - Reproduction of the same horizontal grid at different levels
    The same horizontal grid is reproduced at different vertical levels.
    Each level has its particular mask. The vertical levels can be:
    - V1-1: given at regular or irregular depth or height levels (z coordinate)
    - V1-2: hybrid: the first level follows the topography (atmosphere models) or the bathymetry (ocean models), the last level follows an isobar (atmosphere) or the surface (ocean), with a progressive transition in between
    - V1-3: given at regular or irregular isopycnal (density) levels (r coordinate)
  - V2 - Different horizontal grids at different levels
    The horizontal grid is not reproduced at different vertical levels.
    The horizontal grid can be rotated, translated, or totally unstructured.
- other grid characteristics; grids:
  - may have masked grid points
  - may have "holes" (i.e. they do not cover the whole sphere)
  - may be global or regional
  - may have overlapping grid points
Other requirements for the Transformer may be:
- To support scalar coupling data.
- To support vector coupling data in the standard spherical geographical
coordinate system.
- To support vector coupling data in any set of local coordinate systems.
- To support fields with undefined variables.
- To support source and target coupling domains that totally or partially
overlap (e.g. a global atmosphere with a regional ocean, a regional model
nested into a global model, etc.).
- For basic remapping (1st order conservative), to be able to calculate
automatically the remapping info (addresses and weights).
- To be able to give some statistics and diagnostics on the coupling fields
(mean, max, min, etc.) (Ex: CHECKIN, CHECKOUT in Oasis).
- To support coupling fields whose characteristics may change over time
as the simulation develops (grid, resolution, distribution, ...).
- To be able to save, at a user-defined frequency, its restart data (e.g.
time-accumulated data).
- To understand some standard conventions of meta-data.
- Based on the meta-data description, to perform compatibility checks
between data produced by the source component and data required by the
target component.
- Based on the meta-data description, to recognize the type of coupling
data (flux, vector, scalar) and verify that the user's transformation
choice is appropriate.
- Based on the meta-data description, to perform automatically some basic
transformations (units -e.g. Celsius to Kelvin-, order of dimensions -e.g.
source (x,y,z) -> target (z,y,x)-, etc.), even if not prescribed by the
user.
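The metadata-based compatibility checks and automatic unit conversions above can be sketched as follows; the metadata dictionary layout, unit names, and conversion table are assumptions for illustration, not a PRISM convention:

```python
# Sketch of a metadata-driven automatic transformation: convert a field
# when source and target units differ, reject incompatible ones.

CONVERSIONS = {
    ("Celsius", "Kelvin"): lambda x: x + 273.15,
    ("Kelvin", "Celsius"): lambda x: x - 273.15,
}

def match_units(field, src_meta, tgt_meta):
    """Return the field in the target units, converting if necessary."""
    su, tu = src_meta["units"], tgt_meta["units"]
    if su == tu:
        return field
    try:
        conv = CONVERSIONS[(su, tu)]
    except KeyError:
        raise ValueError(f"incompatible units: {su} vs {tu}")
    return [conv(v) for v in field]
```

The same pattern extends to other checks (dimension order, grid identity), each driven purely by the exchanged metadata rather than by user prescription.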
3.2 Design options for the Transformer location
The transformations can be divided into 3 types:
- Point-wise transformation: an operation that can be completed on each
grid point without any external information, neither from the neighbouring
grid points nor from another model; it can therefore be done locally,
without geographical knowledge and with any domain decomposition. Time
averaging is one example.
- Local transformation: an operation that can be completed in a model
without any information from another model, such as finding the maximum
value of a field.
- Non-local transformation: an operation that requires information from
another model, such as interpolation.
To perform these transformations, the Transformer needs information
coming from the models (e.g. field units, grid, partitioning, etc.), and
information prescribed by the user (e.g. the nature of the transformations).
In a static coupled simulation, this information may be transferred only
once initially; in a dynamic or interactive coupled simulation, these
transfers must be possible at any point during the simulation.
The minimal option is that all transformations are necessarily performed
in a Transformer entity distinct from the component model processes, as
is the case for the OASIS coupler today.
A desirable option is that at least point-wise transformations are performed
locally on the component model processes by transformation routines included
in the PSMILe, before any external exchange. This should be reasonably
easy to achieve as it does not require any parallelisation of the transformation
routines, and is recommended in some cases to avoid extra communication.
All other transformations are performed in a separate Transformer
entity.
Another option is that point-wise and local transformations are performed
locally in the component model PSMILe before any sending or after receiving
the coupling fields. All non-local transformations are performed
in a separate Transformer entity.
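As an illustration of a point-wise transformation performed locally before any exchange, the following sketch accumulates a field at every model timestep and delivers its time mean at the coupling instant. The class and method names are invented for illustration; they are not PSMILe routines:

```python
# Point-wise time averaging done on the model processes: each process
# averages only its own local partition, so no communication is needed.

class TimeAverager:
    """Accumulate a field every timestep; deliver the mean at coupling time."""
    def __init__(self, n_points):
        self.sum = [0.0] * n_points
        self.count = 0

    def accumulate(self, field):
        for i, v in enumerate(field):
            self.sum[i] += v
        self.count += 1

    def coupling_value(self):
        avg = [s / self.count for s in self.sum]
        self.sum = [0.0] * len(self.sum)   # reset for the next coupling period
        self.count = 0
        return avg
```

Because the operation touches each grid point independently, it works unchanged under any domain decomposition, which is exactly why point-wise transformations are the easiest candidates for the PSMILe.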
3.3 Design options for the Transformer parallelisation
If the option of performing point-wise and local transformations directly
in the PSMILe is chosen, the full parallelisation of the local
transformation routines is required, as the PSMILe will in some cases be
linked to fully parallel component models.
For the separate Transformer entity performing
non-local transformations, the following parallelisation options are possible:
- Pseudo-parallelisation:
Between any two models, there is more than one separate sequential
Transformer process, each one treating an equal number of the fields
exchanged in both directions between the two models. This approach may
ensure a better load balance, but implies that each process has to
calculate and store the information required for the transformation, e.g.
the interpolation matrix of weights and addresses, in both directions.
Furthermore, it also implies that each model's information has to be
duplicated in the different related Transformer processes.
- Field-per-field unidirectional approach:
Between any two models, there are two separate sequential Transformer
processes, each one treating the coupling fields exchanged in one
direction. With this approach, each process calculates and stores only the
information needed to do the transformation in one direction; this
advantage is less significant if the transformation information in one
direction can be automatically deduced from the one in the other direction.
The main disadvantage is that the load balance between the separate
Transformer processes can no longer be ensured.
- One executable full parallelisation:
The full parallelisation of the separate Transformer entity is mandatory
if one separate Transformer executable performs all non-local
transformations required between all components; the computing load of this
separate Transformer executable will be important and it is therefore
important that it is parallel and scalable. (Of course, direct
communication between any two components for which no transformations are
required, with or without repartitioning, should still be possible.)
One advantage of this option is that the information about each model, for
example the grid, is duplicated only once, namely in the memory of this
separate parallel Transformer executable. Another advantage is that it
ensures an efficient use of these Transformer processes, especially if the
different data exchanges between any two models do not occur
simultaneously. Furthermore, if the different coupling fields are available
at the same time in the models, they can be packed into one message; this
would present the advantage of reducing the number of messages exchanged
and increasing their individual size.
One disadvantage is that the partitioning of this additional parallel
executable can match the partitioning of only one model; repartitioning
is therefore required for the other model components.
- Many executables full parallelisation:
If the repartitioning required for the above option proves to be too
expensive, the option of using one additional parallel Transformer
executable between any two component models could be envisaged; the
partitioning of each additional parallel Transformer executable could match
the partitioning of one of the two models.
4. PRISM System Model Interface Library & Data Exchange library
requirements and design options
The PRISM System Model Interface Library (PSMILe) is the set of routines
implemented in the model code to interface it with the rest of the PRISM
System (other component models or additional coupling processes).
4.1 PSMILe general requirements
The possible requirements specific to the PSMILe are the following:
- The modifications to implement in the model code are as limited as
possible.
- The PSMILe is layered, and complexity is hidden from the component code.
- The PSMILe includes the Data Exchange library as the most external layer.
- The PSMILe may include transformation routines if transformations are
required locally before the exchange with the rest of the PRISM System.
- The component models are able to run in a stand-alone mode without
modifications, with or without an external Driver.
- The model information (e.g. length of a time step) is given by the model
through the PSMILe and not duplicated externally by the user; this
information may change during the simulation.
- The description of the data, i.e. the meta-data (e.g. units, grid
coordinates, mask, length of a time step, distribution, ...), is given by
the model through the PSMILe and not duplicated externally by the user;
this information may change during the simulation.
- A good trade-off is chosen between (I) a concise list of parameters for
each subroutine call (more subroutines provided, each with a shorter list
of parameters) and (II) a small number of subroutines, each one having a
longer and more complex parameter list. The complexity arises from the need
to transfer not only the coupling data but also the meta-data.
- Interface routines, once defined and implemented, are not subject to
modifications between the different versions of the PRISM coupler. However,
new routines may be added.
- The PSMILe is extendable to new types of coupling data (e.g. data given
on arbitrary grids).
4.2. Data Exchange Library requirements
The Data Exchange library (DEL) performs the exchanges of coupling data
between the component models, or between the component models and the separate
Transformer entity. The DEL must therefore be included as the most external
layer in the PSMILe.
The possible characteristics of the coupling data exchanges are:
"End-point" data exchange: when producing coupling data, the source model
does not know what other model will consume it; when asking for coupling
data a target model does not know what other model produces it.
The coupling data can be of different types: integer, real, character,
1D-2D-3D-xD arrays, structures, operators, functions, ...).
The coupling data are exchanged but also possibly on their metadata, i.e.
the description of the data (e.g. units, grid coordinates, mask, distribution,
...).
The coupling fields characteristics, and therefore the associated metadata,
may change over time as the simulation develops (grid, resolution, ...)
Coupling data produced by a source model can be consumed by more than one
target model.
Coupling data produced by one model may be only partially consumed by the
target model; extraction of subspaces, hyperslabs or indexed grid points
may be required before the exchange.
Different coupling data produced by one model may have to be combined before
the exchange.
Algebraic operations may have to be performed on the coupling data before
the exchange.
Coupling data produced by a source model can be consumed by the target
model at the different frequency (i.e. one "put" will not necessarily
match one "get" -time integration/interpolation will be required).
Occurrence of the exchange can be different for the different coupling
fields.
Occurrence of exchange is flexible (exchange can occur at a fixed frequency,
at different pre-defined timesteps, on given dates of a physical calendar
-Julian, Gregorian, ...-, etc.).
Coupling data produced from one model at a particular time may be required
as input coupling data for another model at another time.
Occurrence of the exchange is not necessarily defined initially by the
user; it can depend on parameters dynamically calculated during the simulation
(conditional occurrence).
Exchange points can be placed anywhere in the source and target code and
possibly at different location for the different coupling fields.
The exchange can occur directly between two component models without going
through additional coupling processes.
When the component models are parallel and have different data partitioning,
repartitioning associated to direct communication is required; all type
of distributions usually used in model component codes are supported. In
a static coupled simulation, the characteristics of the repartitioning
required between any two component models are fixed, while in a PM
or
CF
dynamic coupled simulation, they may change during the simulation.
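The "end-point" principle and the put/get frequency mismatch above can be sketched with a toy in-memory broker standing in for the DEL; all names are invented (this is not the PSMILe API), and the fallback to the latest earlier value is a crude stand-in for the time integration/interpolation the DEL would perform:

```python
# End-point exchange sketch: models call put/get with a field name and a
# date only; neither names its partner model.

class Broker:
    def __init__(self):
        self.store = {}            # (field_name, date) -> data

    def put(self, name, date, data):
        """Source model publishes data without knowing the consumer."""
        self.store[(name, date)] = data

    def get(self, name, date):
        """Target model requests data without knowing the producer.

        One put need not match one get: a missing date falls back to the
        latest earlier value (no earlier value at all is not handled here).
        """
        if (name, date) in self.store:
            return self.store[(name, date)]
        earlier = [d for (n, d) in self.store if n == name and d <= date]
        return self.store[(name, max(earlier))]
```

A real DEL would route the data directly between processes (with repartitioning) rather than through a central store, but the calling pattern seen by the models would be the same.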
Other specific requirements are:
Data exchange implementation:
The DEL offers efficient data exchange implementations for loose and
strong coupling. Loose coupling is the configuration in which the two component
models are run sequentially or concurrently as two separate executables.
Strong coupling is the configuration in which the two component models
are run within the same executable.
I/O and access to data files:
In some cases, input coupling data will not be provided by another model
but should be read from a file indicated by the user in the coupling
configuration file. This should be transparent for the component model and
managed automatically.
The format of these data files could be a standard PRISM fixed format.
At a later stage, different formats could be supported for these data
files; this would imply that the instance reading the file can interpret
their content.
For parallel component models, the I/O library will have to address
the parallel I/O issue. One option is simply to avoid parallel I/O by doing
regional selection for input data and by doing postprocessing operation
after a simulation to recombine multiple output files provided by the parallel
execution. MPI-IO is another option. Finally, a third option
is to set up a dummy application or I/O demon, which just acts as data
source by reading the file and behaves like a regular model with respect
to the coupled system. This last option is particularly interesting when
the data present in the file need transformation, interpolation or repartitioning
before being used by the model, and therefore is particularly interesting
for parallel models. It is also interesting from the performance point-of-view
if the I/O demon can perform the access to disk concurrently with
the model calculations. However, it supposes that a Driver and an external
I/O demon are active even for a component model running in a totally uncoupled
mode.
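The I/O demon option above amounts to a dummy component that reads the file and delivers its content through the same interface a regular model would use. A minimal sketch; the 'date value' file format and all names are invented for illustration:

```python
# I/O demon sketch: act as a data source by reading a forcing file and
# putting each record into the coupled system exactly as a model would.

def io_demon(path, put):
    """Read 'date value' records and deliver each one via a model-style put."""
    with open(path) as f:
        for line in f:
            date, value = line.split()
            put("forcing", int(date), float(value))
```

Because the demon looks like any other component to the coupled system, the Transformer can interpolate or repartition its data exactly as it would for model-produced fields, which is why this option suits parallel target models.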
Matching of output and input coupling data from different component models:
As discussed above, the DEL could perform, for static simulations,
the matching between output coupling data produced by one model and input
coupling data requested by another model. The matching may be based on
the user's choices indicated in the configuration file, or may be done
automatically when there is only one matching possibility. For dynamic
simulations, information coming from the Driver is required.
4.3 An important design option: a common Model Interface Library for
I/O and coupling data
I/O and coupling data present many common characteristics and should
therefore, in principle, share a common Model Interface Library. It should
be evaluated further if this ideal concept can in fact be implemented without
too many constraints.
The following list of characteristics shared by both I/O and coupling
data was established:
- Data requested or made available by a model. Some data may be I/O and
coupling data at the same time.
- For available data, not all will be effectively delivered by the model.
For each particular simulation, the user has to activate some of them
externally through a configuration file created with a GUI or any other
means.
- Data for which the "end-point data exchange" principle is applicable.
The model itself does not know where the data come from or where they go
to. The source/target models (for coupling data) or the source/target files
(for I/O data) are defined externally by the user for each particular
simulation.
- Data for which transformations may be required. These transformations
are prescribed externally by the user.
- Some data required from another model in coupled mode may in fact be
forcing data read directly from files. In that case, the coupling library
is faced with the same parallel I/O and metadata interpretation
difficulties.
The following list of differences was established:
List of coupling fields is generally smaller that the list of diagnostic
output .
One PRISM objective is to define a standard physical coupling interface
between any two components, i.e. the nature of the coupling fields exchanged;
standardisation of the nature of the diagnostic output will be much more
limited.
Some local transformations required for I/O may not be required for coupling,
and vice-versa.
I/O may require more or different metadata to be transferred from the model.
I/O data needs some mechanism to translate the metadata given by the model
into a CF-style description. This is required for coupling data only if the
coupler is asked to generate its own coupling diagnostics files.
Part III. Coupler review
This section surveys existing couplers and coupling applications, whether
or not targeted at climate modelling:
OASIS from CERFACS
Palm from CERFACS
MpCCI from FhG-SCAI
Calcium from EDF
CCSM Coupler 6 from NCAR
Distributed Data Broker from UCLA
Flexible Modeling System from GFDL
Coumehy from LTHE and IDRIS
For additional information on projects targeting or involving coupling
aspects, on potential underpinning technologies, and on model developments
related to coupling, the reader is invited to consult the document entitled
"Met Office FLUME project - Model Coupling Review" (http://www.metoffice.com/research/interproj/flume/1_d3_r8.pdf),
written by R. W. Ford and G. D. Riley from the University of Manchester.
1. OASIS (http://www.cerfacs.fr/globc/software/oasis/oasis.html)
OASIS is the coupler developed at CERFACS, primarily designed for coupling
climate models, which will be the base of the PRISM coupler developments.
The initial work on OASIS began in 1991 when the ``Climate Modelling
and Global Change'' team at CERFACS was commissioned to build up a French
Coupled Model from existing General Circulation Models (GCMs) developed
independently by several laboratories (LODYC, CNRM, LMD). Quite clearly,
the only way to meet these specifications was to create a very modular
and flexible tool.
OASIS is a complete, self-consistent and portable set of Fortran 77,
Fortran 90 and C routines. It can run on any usual target for scientific
computing (IBM RS6000 and SPs, SPARCs, SGIs, CRAY series, Fujitsu VPP series,
NEC SX series, COMPAQ, etc.). OASIS can couple any number of models and
exchange an arbitrary number of fields between these models at possibly
different coupling frequencies. All the coupling parameters of the simulation
(models, coupling fields, coupling frequencies, etc.) are defined by the
user in an input file, namcouple, read at run-time by OASIS.
Each component model of the coupled system remains a separate, possibly
parallel, executable and is unchanged with respect to its own main options
(like I/O or multitasking) compared to the uncoupled mode. OASIS handles
only static simulations, in the sense that all component models are started
from the beginning and run for the entire simulation. The models need to
include only a few low-level coupling routines to deal with the export and
import of coupling fields to/from OASIS.
The main tasks of OASIS are:
Communication between the models:
To exchange the coupling fields between the models and the coupler
in a synchronized way, four different types of communication are included
in OASIS. In the PIPE technique, named CRAY pipes are used for synchronization
of the models and the coupling fields are written and read in simple binary
files. In the CLIM technique, the synchronization and the transfer of the
coupling data are done by message passing based on PVM 3.3 or MPI2. In
particular, this technique allows heterogeneous coupling. In the SIPC technique,
using UNIX System V Inter Process Communication possibilities, the synchronization
is ensured by semaphores and shared memory segments are used to exchange
the coupling fields. The GMEM technique works similarly to the SIPC one
but is based on the NEC global memory concept.
Transformation and interpolation of the coupling fields:
The fields given by one model to OASIS have to be processed and transformed
so that they can be read and used directly by the receiving model. These
transformations, or analyses, can be different for the different fields.
First a pre-processing takes place which deals with rearranging the arrays
according to OASIS convention, treating possible sea-land mismatch, and
correcting the fields with external data if required. Then follows the
interpolation of the fields required to go from one model grid to the other
model grid. Many interpolation schemes are available: nearest neighbour,
bilinear, bicubic, mesh averaging, Gaussian. Additional transformations
ensuring, for example, field conservation occur afterwards if required. Finally,
the post-processing puts the fields into the receiving model's format.
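At run-time, all the interpolation schemes listed above reduce to applying precomputed weights and addresses to the source field; only the generation of the weights differs between schemes. A generic sketch, with hypothetical names:

```python
def apply_remap(weights, addresses, src):
    """Apply precomputed interpolation weights:
    dst[i] = sum_k weights[i][k] * src[addresses[i][k]].

    Nearest-neighbour uses one weight of 1.0 per target point, bilinear
    uses four weights, and conservative remapping uses one weight per
    overlapping source cell; the run-time application is identical."""
    return [sum(w * src[a] for w, a in zip(ws, adrs))
            for ws, adrs in zip(weights, addresses)]
```

Separating the (expensive, grid-dependent) weight generation from the (cheap, per-timestep) weight application is the usual design choice in such couplers.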
2. Palm (http://www.cerfacs.fr/globc/PALM_WEB/)
The PALM project, currently underway at CERFACS, aims to provide a coupler
allowing a modular implementation of a data assimilation system. In this
system, a data assimilation algorithm is split up into elementary "units"
such as the observation operator, the computation of the correlation matrix
of observational errors, the forecast model, etc. PALM ensures the synchronization
of the units, drives the communication of the fields they exchange, and
performs elementary algebra if required.
This goal has to be achieved without a significant loss of performance
compared to a standard data assimilation implementation. It is therefore
necessary to design the PALM software in view of the following objectives
and constraints:
modularity: PALM provides a mechanism for synchronization of pre-defined
functional units that can be executed in sequence, concurrently, or in
a mix of these two modes. A key aspect of PALM is that dynamic execution
of the units (i.e. units can be launched and stopped at any point during
the simulation) and conditional execution of the units are allowed. PALM
also performs the required exchange of information between these units.
portability: PALM aims to run on all the existing high-performance platforms
and, if possible, on the next generation of super-computers. This effort of
"clairvoyance" can be accomplished only through the adoption of standards.
performance: PALM will be used in two modes: research and operational.
The research mode will be used for the design of new algorithms and will
prioritise flexibility; the operational mode, by contrast, will work with
a fixed configuration of the algorithm and will prioritise performance
optimisation and the monitoring of the process.
PALM is a very flexible and efficient, but somewhat complex, tool. For
PRISM, it remains to be evaluated if this flexibility, and associated complexity,
are required for coupled climate modelling.
3. MpCCI (http://www.mpcci.org/)
The Mesh-based parallel Code Coupling Interface (MpCCI) is a coupler
written for multidisciplinary applications by the Fraunhofer Gesellschaft
Institute for Algorithms and Scientific Computing (FhG SCAI). MpCCI enables
different industrial users and code owners to combine different simulation
tools. MpCCI can be used for a variety of coupled applications like fluid-structure,
fluid-fluid, structure-thermo, fluid-acoustics-vibration, but was not explicitly
designed for geophysical applications.
MpCCI is based on COCOLIB, developed during the CISPAR project funded
by the European Commission, and on GRISSLi-CI, developed during the GRISSLi
project funded by the German Federal Ministry for Education and Research.
MpCCI is not an open source product, but the compiled library offering
basic functionality can be downloaded for free from the web site; special
agreements apply for add-on features such as more sophisticated interpolation
schemes.
MpCCI is written in C++ and is based on MPI1. MpCCI is mainly a parallel
model interface library which provides the usual coupling functionality:
1- the interpolation of the coupling fields (including the neighbourhood
search and calculation of weights and addresses), and 2- the exchange of
coupling data between the codes (including data repartitioning when required).
MpCCI also includes a separate control process which performs only simple
monitoring of the different codes, as MpCCI handles only static couplings.
The coupling is performed by placing MPI-like sending and receiving
instructions in the coupled codes. MpCCI does not fully adhere to the principles
of "end-point data exchanges" as each model has to know the target/source
of its sending/receiving instructions. However, each code simply works
with its own local mesh and needs no specific knowledge of the other code
characteristics.
As MpCCI is based on MPI1, heterogeneous coupling is supported as long
as the MPI1 implementations of the different platforms involved in the
coupling allow it.
4. Calcium (http://www.irisa.fr/orap/Publications/Forum8/berthou.pdf)
Calcium is a coupler of codes developed by Electricite De France (EDF),
and written, like MpCCI, for multidisciplinary applications. Calcium ensures
the exchange of coupling data between the codes in a synchronized way.
The exchanges are based on PVM and heterogeneous coupling is allowed. Furthermore,
Calcium automatically performs the temporal interpolation of the coupling
data when the sending frequency of the source code does not match the receiving
frequency of the target code. Calcium is used by about 10 different research
or industrial groups mainly in France and is implemented in about 20 codes.
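The temporal interpolation Calcium is described as performing can be sketched as a simple linear interpolation between the two most recently received coupling fields; the function below is illustrative only, not the Calcium API.

```python
def time_interp(t, t0, f0, t1, f1):
    """Linearly interpolate in time between two received coupling fields,
    f0 (valid at time t0) and f1 (valid at time t1), for a target time t.

    This is what a coupler may do when the sending frequency of the
    source code does not match the receiving frequency of the target."""
    alpha = (t - t0) / (t1 - t0)
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(f0, f1)]
```

The receiving code can thus ask for the field at its own timestep, regardless of the timesteps at which the source code actually produced it.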
5. CCSM Coupler 6 (http://www.ccsm.ucar.edu/models/cpl6)
The Next Generation Coupler (NGC) - also called cpl6 - is the coupler
being currently developed at NCAR, for the next version of the Community
Climate System Model (CCSM), in the framework of the Accelerated Climate
Prediction Initiative (ACPI) Avant Garde Project. The NCAR team has performed
the requirement capture, has described a design, and is presently in the
development phase.
The main characteristics of the NGC are:
The coupler is written in F90 and explicitly designed to couple four models:
atmosphere, land, ocean, sea ice. No flexibility concerning the number
of models or their nature is included. Due to the nature of these components,
exchange of 2D fields only is supported. These four models can run concurrently,
sequentially or in a mix of these two strategies. Each component and the
coupler are separate executables (MPMD paradigm).
The coupler can run in parallel, decomposed over an arbitrary number of
processors, and supports the following types of parallelism: pure shared-memory,
pure message-passing, and hybrid parallelism incorporating threading on
multiprocessor nodes and message passing between the nodes.
The transformations performed by the coupler include interpolation (conservative
remapping using the SCRIP library), merging of coupling data originating
from multiple source grids, time-accumulation and time-averaging, diagnostic
computing, writing of history data sets, and also the computing of certain
interfacial fluxes between components. This choice was made as the fluxes
need to be calculated at the higher resolution AND at the higher required
frequency, which may be characteristics belonging to different models.
All coupling data exchanges are performed with MPI. Parallel exchange of
coupling fields and repartitioning is possible. However, all coupling fields
exchanged between any two components have to go through the coupler where
the transformations are performed; direct component to component exchanges
are not allowed.
One can note here that the goal of achieving efficient vector processing
performance was not identified as a mandatory requirement for the NGC.
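One of the transformations listed above, time-accumulation and averaging of a coupling field between two coupling instants, can be sketched as follows; this is an illustrative sketch, not the cpl6 API.

```python
class Accumulator:
    """Accumulate a coupling field over the model timesteps falling
    between two coupling instants and return the time-average
    (illustrative sketch; not the cpl6 API)."""

    def __init__(self, npoints):
        self.total = [0.0] * npoints
        self.count = 0

    def add(self, field):
        # Accumulate one model timestep's worth of the field.
        self.total = [t + f for t, f in zip(self.total, field)]
        self.count += 1

    def average(self):
        # Return the time-average and reset for the next coupling interval.
        avg = [t / self.count for t in self.total]
        self.total = [0.0] * len(self.total)
        self.count = 0
        return avg
```

Placing this accumulation in the coupler, rather than in each model, is what allows components with different timesteps to exchange consistently averaged fields.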
6. UCLA Distributed Data Broker (http://www.atmos.ucla.edu/~drummond/DDB/)
The Distributed Data Broker (DDB) is a software tool designed to handle
distributed data exchanges between the UCLA Earth System Model (ESM) components.
These components are: Atmosphere General Circulation Model, Atmospheric
Chemistry Model, Ocean General Circulation Model, Ocean Chemistry Model,
and are run as separate executables. The DDB is composed of the Registration
Broker (RB) and of three libraries linked to the component models: the
Model Communication Library (MCL), the Communication Library (CL), and
the Data Translation Library (DTL).
The Registration Broker is a process that collects model information
from the models initially. The RB is only active at the beginning of the
coupled run; thus, any model process involved in the coupling can take this
role and later resume normal model operation. The DDB follows the
"end-point data exchange" principle, which they call the producer-consumer
paradigm. In the registration phase, the different models register their
"production" and "consumption" of coupling data, and the RB performs the
appropriate matching, which becomes effective at run-time.
The MCL contains a set of callable routines that are used by the different
component models to register at the beginning of a run and to perform the
exchanges of data at run-time.
The CL is a set of routines used by the DDB to manage data exchanges
based on the communication libraries available on the computer platforms.
At run-time, the model producing the data sends the data directly to the
consuming model at a given time interval; the consuming model will later
receive the data at a rate dictated by its internal computations. Heterogeneous
coupling is allowed as long as the communication libraries available on
the computer platforms support it.
The DTL transforms data in a given grid domain to the domain of the
requesting model. This library will include routines ranging from simple
linear interpolation to high-order data translation.
7. GFDL Flexible Modeling System (FMS) (http://www.gfdl.noaa.gov/~fms/)
The design of the FMS is geared toward coupled climate models running
as a single executable. The component models that can be included into
the FMS are atmosphere, land, ocean and ice models. In the FMS, the coupler
is the main program, which drives the component models. To interact, these
components communicate only with the coupler. They may be on different
grids and have different data decompositions and the coupler manages the
transformations required between them. Recently, some parallelisation concepts
were tested on the component models themselves, using abstract parallel
dynamical kernels: the parallelism is in fact built into basic operators
invoked in the model, including arithmetic operators as well as differential
operators such as curl, div, grad and laplacian.
8. COUMEHY (contact: C. Messager, messager@hmg.inpg.fr)
COUMEHY is an ongoing French coupling project, involving the "Laboratoire
des Transferts en Hydrologie et Environnement", "Hydrosciences Montpellier",
the MEVHySA team from the "Institut de Recherche pour le Developpement"
from Montpellier, and the "Institut du Developpement et des Ressources
en Informatique Scientifique". The objective of this scientific and technical
project is to couple one atmospheric model to different hydrological models
running on different platforms, in order to evaluate the importance of
coupling processes between atmospheric and continental hydrological cycles,
in the perspective of global climate change. The interoperability of different
codes running on different machines was in that case a strong requirement:
the choice was therefore made to base the communication on CORBA (Common
Object Request Broker Architecture).
Annexe I
Technical advantages and disadvantages of a dynamic
Driver
Technical advantages and disadvantages of a dynamic Driver with respect
to the process management (PM dynamic Driver) are discussed here.
A PM dynamic Driver may technically be required for the following reasons:
To avoid waste of computing resources:
Consider two component models, each a separate executable, that are run sequentially.
In a static configuration, when the first model runs, the other is simply
waiting, and vice-versa, this alternation occurring at each coupling timestep.
For computing platforms on which the Operating System (OS) cannot efficiently
swap the waiting model, this results in a waste of computing resources.
To avoid this waste, a dynamic Driver could launch each model in turn
at the coupling frequency. This option could especially be considered for
models or coupling processes having a comparatively small computing load
as illustrated below. This option implies that the temporal coupling loop
is controlled by the Driver. The disadvantage is that the restart procedure
of the model and the loading of appropriate data have to be done each time.
To run a coupled model whose total memory requirement exceeds the memory available:
The memory required for a coupled model including all PRISM components
may be greater than the total memory available, but not all models may
be active at the same time. Once again the static option is sufficient
if an efficient OS swap functionality exists on the machine, but if this
is not the case, a dynamic Driver, launching the components when required,
may be useful. As above, the disadvantage is that the restart procedure
of the model and the loading of appropriate data have to be done each time;
furthermore, it will probably be difficult in most cases to predict precisely
which components will be active at the same time and which at different
times.
If the Driver has to manage dynamically its own buffering processes:
If message passing is used to exchange the coupling data, and if two
components exchanging a high number of 3D fields are run sequentially or
simply not well synchronised, the messages will pile up in the message-passing
mailbox, whose capacity may then be exceeded. In that situation,
the Driver may have to manage dynamically its own buffering
processes. However, on most platforms, the message-passing mailbox
can be as large as the available memory; if this is the case, this third
argument is no longer valid.
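The buffering the Driver would have to manage in that situation can be sketched as a bounded queue that the Driver drains the producer's messages into; all names are illustrative, not an actual PRISM interface.

```python
from collections import deque

class DriverBuffer:
    """Bounded buffer a Driver could manage itself when the message-passing
    mailbox risks overflowing (illustrative; not an actual PRISM interface)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()

    def push(self, msg):
        # The Driver drains the producer's messages into its own buffer;
        # once the capacity is reached, the producer would have to block.
        if len(self.queue) >= self.capacity:
            raise OverflowError("driver buffer full; producer must block")
        self.queue.append(msg)

    def pop(self):
        # The consumer receives the messages in the order produced.
        return self.queue.popleft()
```

The capacity bound is what distinguishes this from simply letting messages accumulate in the message-passing mailbox: the Driver, not the communication layer, decides when the producer must wait.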