PrePALM attribute of a communication
sending an object to the buffer. It means that the object is linearly
combined with a previous instance of the same object (same name,
time, and tag) in the
buffer. The user provides the linear coefficients for the old and
the new version of the object: for instance the coefficients (1.,
1.) correspond to "old + new", the coefficients (1., -1.) to "old -
new", and (0.5, 0.5) to the average of the two. This is a shortcut for
assembling objects in the buffer and
can be replaced by a more general linear combination taken from the algebra toolbox. An object stored in the
buffer via an "Add" communication has the "not ready" state and must be set to "ready" with an action described among the step-actions before being received.
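The linear combination described above can be illustrated with a minimal Python sketch. It is not PALM code: the dictionary standing in for the buffer, the function name `add_to_buffer` and the treatment of a missing previous instance as zeros are assumptions made for the illustration.

```python
def add_to_buffer(buffer, key, new, coeffs):
    """Linearly combine `new` with the instance stored under `key`.

    key    -- (name, time, tag) tuple identifying the object
    coeffs -- (c_old, c_new) weights for the stored and incoming versions
    """
    c_old, c_new = coeffs
    entry = buffer.get(key)
    # Assumption of this sketch: a missing previous instance counts as zeros.
    old = entry["value"] if entry is not None else [0.0] * len(new)
    combined = [c_old * o + c_new * n for o, n in zip(old, new)]
    # An object assembled this way stays "not ready" until a step-action
    # sets it to "ready".
    buffer[key] = {"value": combined, "ready": False}
    return combined


buffer = {}
key = ("temperature", 10, 0)
buffer[key] = {"value": [2.0, 4.0], "ready": True}
diff = add_to_buffer(buffer, key, [1.0, 1.0], (1.0, -1.0))  # old - new
mean = add_to_buffer(buffer, key, [3.0, 5.0], (0.5, 0.5))   # average of the two
```

Each call overwrites the stored value with the combination, so successive calls accumulate contributions in the same buffer slot.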
The user defined algebra macro-units are
obtained by composition of units from the algebra
toolbox, Fortran code and control structures. The algebra composer
is the graphical tool to describe the composition. The description is
stored in the files which can be
imported and exported
by different instances of PrePALM. This feature allows for the exchange of
algebra macro-units among users. Starting
from the files
In order to synchronize two or more branches it is possible to
force a rendezvous. A barrier is enabled by the attribute
of a primitive
invoked by the concerned branches. The branches will be blocked until
every concerned branch has reached the step.
For distributed objects: it indicates the
rank of the descriptor reference unit process(or)
(cf. the entry).
A branch is a sequence of PALM units. Units belonging to the same branch
will be executed sequentially, but several branches can run
concurrently. A branch can start at the beginning of the execution
( attribute) or can be invoked by another branch
via the Start instruction.
A branch is defined by a control code in the input file. At the generation of the PALM files each
branch generates a FORTRAN 90 file to be linked with the application.
The instructions in a branch are:
control constructs:
palm primitives invoking units or branches
synchronization primitives
communications of scalar (one element) objects
FORTRAN 90 style regions to handle local variables
STANDARD: the Gregorian calendar. Leap years are the multiples of 4,
except multiples of 100 that are not multiples of 400 (e.g. 1800 is not leap, 2000 is).
NOLEAP: no leap years at all.
360: all years are 360 days long, subdivided into twelve 30-day
months.
JULIAN: all years which are multiples of 4 are leap (e.g. both
1800 and 2000 are leap).
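The four calendar rules can be written down as a short Python sketch. The function names `is_leap` and `days_in_year` are hypothetical and serve only to make the rules above explicit; they are not part of PALM.

```python
def is_leap(year, calendar="STANDARD"):
    """Return True if `year` is a leap year in the given calendar."""
    if calendar == "STANDARD":
        # Gregorian rule: multiples of 4, except centuries not divisible by 400.
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    if calendar == "JULIAN":
        # Every multiple of 4 is a leap year.
        return year % 4 == 0
    if calendar in ("NOLEAP", "360"):
        return False
    raise ValueError("unknown calendar: " + calendar)


def days_in_year(year, calendar="STANDARD"):
    """Return the length of `year` in days for the given calendar."""
    if calendar == "360":
        return 360  # twelve months of 30 days
    return 366 if is_leap(year, calendar) else 365
```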
The PrePALM graphical user interface is
based on the branches graph, with the units in sequence and the communication patterns plus
a few indications relative to the algorithm control structures (e.g. do loops)
and some comment texts. The window pane where graphical items
are displayed is called the "canvas".
The means for a unit (or a branch) to receive (get) or to release
(put) an object.
The PALM paradigm is based on end-point communications. A unit simply
notifies that an object is requested (PALM_Get) or made available (PALM_Put). The user defines the
correspondence between the two sides of the communication via the PrePALM interface. Communications can be
established between two ordinary units, between a unit and the buffer and between an ordinary unit and an algebra unit. Because a get is
blocking, communications can be an easy way to synchronize
units.
The LIOO protocol is used.
When a distributed unit or an algebra unit is launched, the driver dynamically allocates
the needed number of idle
processors for it: these constitute the execution context. There is no
guarantee that successive
instances of the same unit will use the same processors, nor that two
units belonging to the same branch will share the same
processors. It can be
useful in certain cases (use of static variables, re-entering routines
like reverse communication solvers, ...) to force successive
instances of the same unit or two different units to reuse the same
context. It can be obtained by selecting in PrePALM a static execution
context. Notice that this strategy puts strong constraints on
the dynamic allocation of resources and can lead to a deadlock of the
application for lack of resources.
An algebra unit is
dynamically launched by the driver. The driver
has the choice between executing the operation on one of the service
processes allocated to the driver, the buffer or the mailbuff, or
dedicating an idle process to the
algebra unit. class units
are executed on the service processes and class units run on dedicated
processes. The choice is operated in PrePALM when inserting or editing
an algebra unit. Notice that some operations strictly need to run on
dedicated processes (it is the case, for instance, of units
implementing iterative algorithms in reverse communication): in such a
case the user choice is ignored.
In order to diagnose the results of part of a simulation or to
check the correctness of some data, the user can apply specific
procedures to the objects exchanged via a
communication. These user defined
procedures are implemented in the palm_debug.f90 file which is user
provided and which has to be linked with the application. Debug
is activated or not according to a communication attribute (set via
the PrePALM
interface):
PL_NO_DEBUG No debug procedure is applied
PL_DEBUG_ON_SEND The debug procedure is applied on the source side of the communication.
PL_DEBUG_ON_RECV The debug procedure is applied on the target side of the communication.
PL_DEBUG_ON_BOTH The debug procedure is applied on both sides of
the communication
The debug procedures can produce diagnostic output in the standard
palm output files (unit PL_OUT) and can return an error code which is propagated back to the
user code as an output argument of PALM_Get or PALM_Put. Further details are given in the
chapter about the diagnostic tools.
A distributor can be described at
run-time. In this case the identity card indicates the name of the user
provided function which describes the distributor. A template with the
proper syntax for a distribution function is provided with the PALM releases.
It is the semantic entity in the identity card of a parallel unit which
describes the placement of the objects on the unit processes.
Three kinds are available:
SINGLE_PROC when the object is a non-distributed one.
CUSTOM may describe any kind of object
distribution by specifying coordinates of global object blocks
on each local process on which it is distributed. However, this
distribution is not well suited for describing cyclic distributions
(because each block should be separately identified).
REGULAR may describe block, cyclic and
block-cyclic distribution in an efficient way. Very close to
ScaLAPACK, but extended to 7 dimensions. For more details please refer
to the ScaLAPACK
user guide.
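The block-cyclic layout that a REGULAR distributor can describe may be sketched in one dimension as follows. This is the usual ScaLAPACK-style index mapping with 0-based indices and the distribution starting on process 0; the function names are hypothetical and the sketch does not reproduce the PALM distributor syntax.

```python
def block_cyclic_owner(g, block, nprocs):
    """Process that owns global index `g` (0-based) for blocks of size `block`."""
    return (g // block) % nprocs


def block_cyclic_local_index(g, block, nprocs):
    """Position of global index `g` in the local array of its owner."""
    return (g // (block * nprocs)) * block + g % block


# 12 elements, blocks of 2, 3 processes
owners = [block_cyclic_owner(g, 2, 3) for g in range(12)]
```

With `block = 1` the mapping is purely cyclic; with `block` equal to the number of elements divided by `nprocs` (rounded up) it reduces to a plain block distribution.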
A distributor description is relative to a base processor of the unit, which is numbered 0 in the distributor. It
points to the first process of the distribution (notice that a
distributor may not concern all the processes of a unit).
For a non-distributed (SINGLE_PROC) object sent by only one process of a
parallel unit, the base processor points to the process issuing
Its syntax is explained in session six of Tutorial One.
The driver is the PALM orchestra conductor. It is the
supervisor of the whole application and therefore it drives all the
dynamic decision processes (allocation of idle processors, routing of
communications through the mailbuff, distribution of the buffer, etc.). Further details on the role
of the driver are given in the section on the PALM run-time
components.
All PALM primitives return an integer error code as last argument. The
error code is 0 if no error occurred. If the error code is not 0, a
mnemonic description of the error can be printed to the standard
palm output files via a call to the
primitive.
Description of units properties, spaces, input/output objects and distributors. Its format is described in the
section on the PALM set up components.
At the beginning of a PALM application all the processors but one
(the driver) are idle. The driver, then,
dynamically allocates processes from the idle processors pool
to branches, units or to the distributed buffer. The processors released by terminated
units, finalized branches or by the shrinking of the buffer join again
the idle processors pool. In case of lack of resources the
driver handles a list of pending requests (N.B. this situation can
lead to deadlock).
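The life cycle of the idle processors pool can be pictured with the following Python sketch. The class `IdlePool`, its method names and the first-come first-served handling of pending requests are assumptions made for the illustration; the actual policy of the driver is not described here.

```python
from collections import deque


class IdlePool:
    """Toy model of the idle processors pool managed by the driver."""

    def __init__(self, nprocs):
        # Rank 0 plays the driver; all the other ranks start idle.
        self.idle = list(range(1, nprocs))
        self.pending = deque()

    def request(self, name, count):
        """Grant `count` idle ranks to `name`, or queue the request."""
        if count <= len(self.idle):
            return [self.idle.pop(0) for _ in range(count)]
        self.pending.append((name, count))
        return None

    def release(self, ranks):
        """Return ranks to the pool and serve pending requests in order."""
        self.idle.extend(ranks)
        started = []
        while self.pending and self.pending[0][1] <= len(self.idle):
            name, count = self.pending.popleft()
            started.append((name, [self.idle.pop(0) for _ in range(count)]))
        return started
```

A request for more processors than will ever be released stays in the pending list forever, which is the deadlock situation mentioned above.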
In order to optimize the performance of the execution, each
processor keeps a local mailbuff. This is used to avoid
message passing transfers between the processor
and the central mailbuff for pending communications whose source and target will
certainly run on the same processor. This is the case for
mono-processor units belonging to the same branch or for re-entering sticky units. To prevent the local mailbuff
from filling up the processor memory, this feature is optionally activated by
the user via PrePALM.
Attribute of a communication sending an object to the buffer. It means that the object is
inserted all at once and that it overrides any previous
instance of the same object in the buffer. An object stored in the
buffer via an "Insert/Replace" communication has the "ready" state.
The design of PALM is based on the assumption that exchanged
objects share the same grid representation on the source and on the target side. Therefore there is no grid to
grid interpolation in PALM. The only interpolation performed by PALM
is between different time instances of the
same object (cf. the entry).
The action by which a branch invokes
the execution of a unit. When a branch
launches a unit, it is suspended and its processor is used by the
unit. If the unit is distributed over n processes, its execution
actually begins only when (n-1) idle
processors are available.
The PALM library. Provided with the distribution. It has to be
linked with the application.
The Last In Only Out protocol for communications. Following this protocol
the target of a communication will always
receive the most recent version of an object
produced by the source. Freshly produced
versions of an object override the previous ones. With this protocol
there is no need of a message queue, but some synchronization problems
can arise (see the section on the
PALM actions concerning the data exchange for further
explanations).
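The protocol can be pictured as a single storage slot that each new version overwrites. The Python sketch below is only an illustration: the class name `LiooSlot` is hypothetical, the get does not consume the object (an assumption), and an exception replaces the blocking behaviour of a real get.

```python
class LiooSlot:
    """One storage slot following the Last In Only Out rule."""

    def __init__(self):
        self._value = None
        self._filled = False

    def put(self, value):
        # A freshly produced version overrides the previous one: no queue.
        self._value = value
        self._filled = True

    def get(self):
        # A real get would block here; the sketch raises instead.
        if not self._filled:
            raise LookupError("no version produced yet")
        return self._value


slot = LiooSlot()
slot.put("version 1")
slot.put("version 2")
latest = slot.get()
```

The first version is lost as soon as the second one is produced, which is why the relative timing of source and target matters with this protocol.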
The predefined units from the algebra
toolbox execute one operation at a time. Inputs are recovered
and outputs are sent or stored via the ordinary PALM communication mechanism. If an algebraic
treatment is constituted by a sequence of basic operations, it is more
efficient to pass by address the intermediate results from one unit to
the following. This can be achieved by defining a macro-unit which
groups several algebraic units and defines how some output and input
objects share addresses. Only some among the units inputs and outputs
will be kept as inputs and outputs of the macro-unit. Moreover, the
user can introduce control structures, such as IF statements or loops, and
can operate on the intermediate results or on local variables with
Fortran 90 code regions. The macro-units are described by the user by
means of the graphical algebra
composer tool.
In order to grant a full independence between the order of
objects production and reception,
the produced objects which are not immediately consumed have to be
stored in a memory space acting as a mailbox. To avoid confusion with
the MPI mailbox, this temporary storage space
has been renamed "mailbuff" because it shares its memory location with the
buffer.
Starting from the input
file which describes the experiment, the user interface PrePALM generates the run-time input file and a
number of FORTRAN 90 files to be linked with the application (branch
code, user parameters, palm_init.f90 file, palm_trigger.f90 file).
PrePALM can generate the PALM files in the command line mode (cf. the
section on how to set up a PALM application).
The standard message passing library on top of which PALM is
built. Version 1 of MPI is widespread and every
supercomputer vendor provides an optimized version of MPI1. The
MPI1 standard covers the needs of SPMD
applications. Further details can be found in the official MPI web
site http://www-unix.mcs.anl.gov/mpi/index.html
The new extended standard of the message passing library MPI. It covers topics on process management,
one-sided communications and parallel I/O which were not addressed in
the MPI1 standard. This version is needed for the implementation of MPMD applications. Further details can be found
in the official MPI web site http://www-unix.mcs.anl.gov/mpi/index.html
Parallel programming paradigm in which the parallel application
consists of a collection of independent programs executing
concurrently. Processes can join or leave the application dynamically.
The acronym comes from
Multiple Program Multiple Data.
Attribute of a branch which has to be
explicitly invoked by another branch. (cf. the entry).
An object stored in the buffer can be
recovered by a unit issuing a get only if its
state is set to "ready". This is normally
the case of the objects
"inserted" in the buffer ("Insert/Replace" attribute, PL_INS keyword), whereas objects assembled in the
buffer via "Add" communications (PL_ADD keyword) or objects explicitly set to "not
ready" (by the unready action) need to have their
state changed by a step-actions instruction
before being received. A PALM_Get trying
to receive an object set to "not ready" will hang until the object
changes its state to "ready".
An object is the information chunk which is exchanged in a
communication. It can be a simple
control code as well as a full three
dimensional field. The object definition is local to a unit and it is given in the unit's identity card, but the two definitions
at the two sides of
the communication (the source and the target) must be conforming and designate the
same information chunk.
An object can be considered as a particular instance of a class,
called a space. Each instance is identified
by three fields:
the time stamp which indicates the time,
in the time reference of the experiment, associated with the object. For
example, to treat the temperature field over a period, one has to
consider different objects belonging to the
space of the 3D-fields on the T grid, all with name 'temperature' but
with different time stamps, ranging over all the dates in the
period. An object with no time dependency has time PL_NO_TIME.
the tag, a user defined integer to
distinguish between different versions of an information with the same
"name" and "time". An object for which it is not necessary to operate
such a distinction has tag PL_NO_TAG.
The communication paradigm in
PALM. To ensure modularity, a unit
simply notifies that an object is requested (PALM_Get) or made available (PALM_Put). The user defines the
correspondence between the two communication end-points via the PrePALM interface.
Notice that there is no need of exact matching between the
communication primitives in a unit and the communications which are
actually performed. All the potential communications of a unit are
declared in its "identity card", but
only the communications described in PrePALM will be performed. For the others,
the communication primitives PALM_Get and PALM_Put return immediately
with a non-zero error code. (See also the section on the PALM Data Exchanges).
Primitive to shutdown a PALM application. It ensures a clean
termination and prints some information about the state of the
execution in the standard
palm output files.
The call sequence is
CALL PALM_Abort(err_code)
INTEGER :: err_code
err_code
is an output integer error
code. Its meaning can be explained via a call to
The file which contains the user defined debug procedures (cf. the
entry). It has to be linked with the application.
is an input integer error
code returned by a previous call to a PALM primitive.
err_code
is an output integer error
code. Its meaning can be explained via a call to itself.
The PALM primitive called by a unit when an object is needed. Its interface is described in the section on PALM actions for data
exchanges. The unit
calling PALM_Get identifies itself as the potential target of a communication.
Notice that there is no need of exact matching between the
communication primitives in a unit and the communications which are
actually performed. All the potential communications of a unit are
declared in its "identity card", but
only the communications described in PrePALM will be performed. For the others,
PALM_Get returns immediately
with a non-zero error code.
A call to PALM_Get corresponding to an actual
communication is always blocking.
The call sequence is
The log output file generated on demand. It is used by the PrePALM interface to produce a graphical replay of the
execution and a very simple performance analysis of the units.
The PALM primitive called in a unit when an object is produced. Its interface is described
in the section on PALM actions for data
exchanges. The unit
calling PALM_Put identifies itself as the potential source of a communication.
Notice that there is no need of exact matching between the
communication primitives in a unit and the communications which are
actually performed. All the potential communications of a unit are
declared in its "identity card", but
only the communications described in PrePALM will be performed. For the others,
PALM_Put returns immediately
with a non-zero error code.
A call to PALM_Put is never blocking.
The call sequence is
is a string of predefined length identifying
the object by its .
obj_name
is a string of predefined length indicating
the .
obj_time
is an integer indicating
the object . It can
be computed from a GMT date via a previous call to .
obj_tag
is an integer indicating
the object
local_var
is the local variable of type
XXXX which contains the field associated to the object.
err_code
is an output integer error
code. Its meaning can be explained via a call to
The PALM primitive called by a unit to know if an asked object is going to be produced. It queries if a having the same arguments
corresponds to an actual communication.
Remember that there is no need of exact matching between the
communication primitives in a unit and the communications which are
actually performed. All the potential communications of a unit are
declared in its "identity card", but
only the communications described in PrePALM will be performed. For these
communications, returns with 0 error code,
for the others, with non zero error code.
The call sequence is
is a string of predefined length identifying
the object by its .
obj_name
is a string of predefined length indicating
the .
obj_time
is an integer indicating
the object . It can
be computed from a GMT date via a previous call to .
obj_tag
is an integer indicating
the object
err_code
is an output integer error
code. It is 0 if a communication for the corresponding
has been described, else
its meaning can be
explained via a call to
The PALM primitive called in a unit to know if a produced object is going to be consumed. It queries if a having the same arguments
corresponds to an actual communication.
Notice that there is no need of exact matching between the
communication primitives in a unit and the communications which are
actually performed. All the potential communications of a unit are
declared in its "identity card", but
only the communications described in PrePALM will be performed. For these
communications, returns with 0 error code,
for the others, with non zero error code.
The call sequence is
is a string of predefined length identifying
the object by its .
obj_name
is a string of predefined length indicating
the .
obj_time
is an integer indicating
the object . It can
be computed from a GMT date via a previous call to .
obj_tag
is an integer indicating
the object
err_code
is an output integer error
code. It is 0 if a communication for the corresponding
has been described, else
its meaning can be
explained via a call to
A PALM primitive which can be called in a unit to convert in a unique way a GMT date into
an integer time stamp and vice versa. This
mechanism is needed to make the two sides of an end-point communication independent. The user sets
a reference date and chooses a reference time step in the
PrePALM interface. Given a date, the
PALM_Time_convert function returns the number (integer) of time steps
between the reference date and the given date according to the
selected calendar. Given an integer time n,
the PALM_Time_convert function returns the date corresponding to
n time steps after the reference date according to the
selected calendar. To exchange
an object associated to a date, the time
argument of PALM_Get or PALM_Put should be set to the integer
generated by PALM_Time_convert.
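The two directions of the conversion can be illustrated with a small Python sketch. It only mimics the principle and is not the PALM_Time_convert interface: the function names `date_to_stamp` and `stamp_to_date` are hypothetical, only the STANDARD (Gregorian) calendar is handled through the `datetime` module, and dates falling between two time steps are rounded down (an assumption).

```python
from datetime import datetime, timedelta

# Length in seconds of the reference time steps selectable by the user.
STEP_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def date_to_stamp(date, ref_date, step="hour"):
    """Number of whole time steps between `ref_date` and `date`."""
    delta = date - ref_date
    total_seconds = delta.days * 86400 + delta.seconds
    return total_seconds // STEP_SECONDS[step]


def stamp_to_date(n, ref_date, step="hour"):
    """Date located `n` time steps after `ref_date`."""
    return ref_date + timedelta(seconds=n * STEP_SECONDS[step])


ref = datetime(2000, 1, 1)
stamp = date_to_stamp(datetime(2000, 1, 2, 6), ref, "hour")
date = stamp_to_date(stamp, ref, "hour")
```

Because both sides of a communication derive the integer from the same reference date and time step, they obtain the same time stamp for the same date without exchanging any other information.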
The call sequence is
The constants defined by the user in
PrePALM are saved in the FORTRAN 90 module
contained in the palm_user_param.f90
file and in the FORTRAN 77 include
files which are created or updated at the generation of the PALM files. The
module can be "use associated" or the ".h" file included in the unit codes
in order to ensure coherence between the constants used in PrePALM and
the parameters used in the units.
Ppl keyword for an "" communication
sending an object to the buffer.
Keyword to use instead of the tag
argument in a call to , to designate all the
objects with the prescribed name and time no matter what the actual tag is (it
works as a wildcard for the "user tag" field). The first
object with the right name and time which has been or will be produced
is received.
Keyword to use instead of the time stamp
argument in a call to , to designate all the
objects with the prescribed name and tag no matter what the actual time stamp is (it
works as a wildcard for the "time stamp" field). The first
object with the right name and tag which has been or will be produced
is received.
Keyword for the definition of the space element size.
Private MPI communicator of each parallel unit. It is provided by PALM via the module or the
include file and it has to be used instead of MPI_COMM_WORLD.
Keyword for the definition of the space element size.
Keyword to use instead of the tag
argument in a call to , or to to indicate that the
object of the communication has no specific
tag assigned by the user.
Keyword to use instead of the time stamp
argument in a call to , or to to indicate that the
object of the communication has no time dependency.
An object stored in the buffer can be
recovered by a unit issuing a get only if its
state is set to "ready". This is normally the case of the objects
"inserted" in the buffer ("Insert/Replace" attribute, PL_INS keyword), whereas objects assembled in the
buffer via "Add" communications (PL_ADD keyword) or objects explicitly set to not ready need to have their state changed by
a step-actions instruction before being received.
The reference date for "date to integer" conversion. (cf. the entry).
The reference time step for "date to integer" conversion. Can be
"day", "hour", "minute", or "second". (cf. the entry).
The action to change the parallel distribution of an object. If an exchanged object has a different
distribution on the source and on the target side of a communication, PALM automatically takes
care of the redistribution following the optimal communication pattern.
The description of the properties of the objects belonging to a unit is based on a class association. A class of
objects is said to be a space. Spaces are local to units and therefore
they are described in the identity
cards of the units. The only exception is represented by the
spaces associated at run-time to the algebra toolbox units which are therefore defined
in the PrePALM interface. A space is
identified by its mnemonic name
(string), its shape
(with the FORTRAN syntax (dim1, dim2, ...)) and its
element size, i.e. the variable kind of
a single element: in the case of standard FORTRAN types, the element
size is simply associated to the size of a standard type via the
predefined PL_INTEGER, PL_REAL, PL_DOUBLE_PRECISION, PL_COMPLEX,
PL_LOGICAL, PL_CHARACTER keywords; in the case of a derived
type, the element size is defined as a linear combination of the
previous keywords.
Notice that in the definition of the shape and of the element size (in
the case of derived data types) the constants defined via PrePALM can be used.
Parallel programming paradigm in which the parallel application
consists of a single program executing on more than one process. The
different instances of the program can perform different tasks or
handle different portions of data. The acronym means
Single Program Multiple Data.
Action of beginning a new branch execution. An idle
processor is allocated to the branch. Branches with
attribute are automatically started by the
driver at the beginning of the application. Branches with the
attribute have to be explicitly started from
another branch.
Attribute of a branch which runs from
the beginning of the application. (cf. the entry).
A relevant point where to perform some actions on
buffer objects.
Steps can be defined as communications occurrences
or as explicit invocations of the primitive in the branch
codes. The actions to perform when a step is reached are described in
the section of
the file or via the PrePALM interface.
The actions to be performed on the
objects stored in the buffer when the
algorithm reaches some predefined points, the so-called steps. The steps
can be defined as occurrences of communications or as invocation of the
specific primitive in
the branch codes. The actions can be:
nullify an object in the buffer (make it "not ready" and set its
value to zero)
The syntax of the step-actions language is thoroughly described in the
section about the PALM actions.
A way for a branch to notify that a
relevant step of the computation has been reached.
Useful to synchronize branches (cf. the entry) or to trigger
actions described as step-actions. It has
to be declared in the PrePALM interface
and invoked by the branch codes.
Distributed unit for which the execution context is set by the user via PrePALM. It means that
the unit "sticks" to the processors of its first execution context.
Notice that this strategy puts strong constraints on
the dynamic allocation of resources and can lead to a deadlock of the
application for lack of resources.
A user defined integer to
distinguish between different objects with the same
"name" and "time". An object for which it is not necessary
to operate such a distinction has tag PL_NO_TAG.
The script and graphical language on which the PrePALM interface
is based. You need to have Tcl/Tk installed to run PrePALM. Tcl/Tk can
be freely downloaded from the Scriptics Web site http://dev.scriptics.com/software/tcltk
The time stamp is an integer which
indicates the time, in the time reference of the experiment,
associated to an object. PALM provides a
way to associate in a unique way an integer to a GMT date via the PALM_Time_convert primitive.
In the case of communications
whose source is the buffer, the target
can ask to get objects at times for which they have not been actually
stored in the buffer. (For example it is the case of a tangent linear model
recovering at each time step the reference state, even if it has been
stored only every n time steps). The response of PALM to such a
request depends on the time interpolation scheme selected for the
communication. It can be
PL_GET_EXACT No interpolation has to be operated. If the required
time is not (and will not be) stored in the buffer, PALM_Get fails with non zero error code.
PL_GET_NNBOR Nearest Neighbor interpolation. If the required
time is not (and will not be) stored in the buffer, the closest time
will be selected instead. When PALM_Get returns, the "time" argument
contains the time stamp actually selected. It can be converted back to
a date with a call to .
PL_GET_LINEAR Linear interpolation. If the required
time is not (and will not be) stored in the buffer, PALM linearly
interpolates the two closest neighbors.
PL_GET_CUSTOM User defined interpolation. If the required
time is not (and will not be) stored in the buffer, PALM interpolates
the two closest neighbors using a rule provided by the user in the palm_time_int.f90 file
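The first three schemes can be illustrated on scalar values with the Python sketch below. The function `get_at_time` is hypothetical and is not the PALM implementation: exceptions stand in for the non-zero error code, a tie between two equally close neighbors is resolved in favour of the earlier time, and no extrapolation is attempted outside the stored range (all assumptions).

```python
def get_at_time(stored, t, mode="PL_GET_EXACT"):
    """Return (time, value) for a request at time `t`.

    stored -- dict mapping integer time stamps to scalar values
    """
    if t in stored:
        return t, stored[t]
    if mode == "PL_GET_EXACT":
        raise KeyError("time %d is not stored" % t)
    times = sorted(stored)
    if mode == "PL_GET_NNBOR":
        # The returned time is the one actually selected.
        nearest = min(times, key=lambda s: abs(s - t))
        return nearest, stored[nearest]
    if mode == "PL_GET_LINEAR":
        before = [s for s in times if s < t]
        after = [s for s in times if s > t]
        if not before or not after:
            raise KeyError("time %d is outside the stored range" % t)
        t0, t1 = before[-1], after[0]
        w = (t - t0) / (t1 - t0)
        return t, (1.0 - w) * stored[t0] + w * stored[t1]
    raise ValueError("unsupported mode: " + mode)


stored = {0: 0.0, 10: 20.0}
nearest = get_at_time(stored, 3, "PL_GET_NNBOR")
linear = get_at_time(stored, 5, "PL_GET_LINEAR")
```

For PL_GET_CUSTOM the weighting of the two neighbors would come from the user-provided rule instead of the linear weight `w`.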
An algebra unit needs some information
about the time stamp and the user
tag of the objects it is going to send and to receive. For each
object linked to something via a communication, the user must indicate how
the algebra unit knows these attributes. The choice of the is amongst
the user set the value as an integer, a constant or a predefined value
, ,
, (the last two only
for input objects).
the user tells the algebra unit that it
will receive the information via a communication. An input plug is
automatically generated on the receiver region.
the information is computed using
basic arithmetic operations on the other values. They have to be
referenced by their preceded by the $ symbol.
According to the selected verbosity level, it is possible to track
the communications. It means that
some information on the ongoing communications is printed in the
standard output files. Since the
amount of information printed for all the communications would be
overwhelming, it is possible to set an attribute ( or
) to indicate whether the communication has to
be tracked or not.
The automatic launching strategy of algebra units is based on the availability of
necessary input. The user must indicate at least a triggering input
object for an algebra unit, but may indicate more. The unit is not
launched until all the triggering inputs are received or are available
in the buffer.
A PALM communication takes place only
if a possible pattern
between a and a has been defined. We call such a
pattern a . All possible tubes are defined in the description.
It is important to understand that a tube simply defines a
potential data exchange pattern, but that there is no obligation to
actually perform the corresponding communication. For this reason it
is mnemonically associated to a tube line: even if the line exists, no
passengers are transported unless a metro car travels on the line.
Since tubes and communications are strictly related, in the PrePALM interface the word "communication" is used
to designate "communication tubes".
PALM can issue optional run-time control messages. The verbosity
control is based on several levels (,
and
plus some finer levels,
mainly for development and debugging purposes). The verbosity level
can be separately set for different categories of messages (cf. the
section on the diagnostic tools for
a thorough explanation). Moreover it is possible to set a maximum
overall verbosity level which applies to all categories and overrides
the specific levels if they are higher than the overall level.
The default overall and specific levels are set via the PrePALM interface. These values can be
changed at run-time using the , , ,
and
primitives.
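The way the overall level caps the specific levels can be summarized in a few lines of Python. The integer encoding of the levels, the category names and the function `effective_levels` are assumptions made for the illustration.

```python
def effective_levels(specific, overall):
    """Cap each category level by the maximum overall verbosity level.

    specific -- dict mapping a message category to its verbosity level
    overall  -- maximum overall verbosity level (higher means more verbose)
    """
    return {category: min(level, overall) for category, level in specific.items()}


# Hypothetical categories: "comm" is above the overall level and is capped.
levels = effective_levels({"comm": 3, "alloc": 1}, overall=2)
```

Specific levels below the overall level are left unchanged; only those above it are reduced.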