Annual Report



3.3  Software engineering

3.3.1  OpenMP (F. Loercher, S. Champagneux, L. Giraud)

Parallelising elsA with OpenMP would offer two major advantages over the existing MPI parallelisation:
  • The number of processors used could be chosen independently of the number of blocks.
  • With fewer blocks, the global numerical behaviour improves.
In this work, a strategy to parallelise elsA efficiently with OpenMP has been developed in collaboration with the "Parallel Algorithms" team. Some of the most CPU-time-consuming functions (the Lussor functions and the OperGradIntGF.for routine) have already been parallelised. Fig. 3.16 shows the speedup obtained for LussorSca5 on the Compaq Alpha server of CERFACS as a function of mesh size.
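
The report does not reproduce the directives that were added to elsA, whose kernels are Fortran routines. As an illustration only, the following C sketch shows the general idea of loop-level OpenMP parallelisation inside a single block: the cell loop is shared among threads, so the thread count does not depend on how many blocks the mesh contains. The function name cell_gradient_1d and the one-dimensional central-difference stencil are assumptions made for the example, not elsA code, and the Lussor functions may carry dependencies between cells that this sketch does not address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (not taken from elsA): gradient of u on one
 * 1-D block of n cells with uniform spacing dx.
 * Each interior iteration writes only grad[i], so the iterations are
 * independent and can safely be shared among OpenMP threads. */
void cell_gradient_1d(const double *u, double *grad, size_t n, double dx)
{
    assert(u != NULL && grad != NULL);
    assert(n >= 3 && dx > 0.0);

    /* If the compiler is run without OpenMP support, the pragma is
     * ignored and the loop simply executes serially. */
    #pragma omp parallel for schedule(static)
    for (long i = 1; i < (long)n - 1; ++i) {
        grad[i] = (u[i + 1] - u[i - 1]) / (2.0 * dx);
    }

    /* One-sided differences at the two boundary cells. */
    grad[0] = (u[1] - u[0]) / dx;
    grad[n - 1] = (u[n - 1] - u[n - 2]) / dx;
}
```

With a compiler that supports OpenMP, the number of threads can be set at run time through the standard OMP_NUM_THREADS environment variable, without touching the block decomposition.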


3.3.2  Management and support (M. Montagnac, J.-F. Boussuge, S. Champagneux)

Tasks related to software management and code engineering are of primary importance in both research and industrial working environments. CERFACS' industrial partners require short turnaround times, reliability and robustness, among many other aspects. CERFACS researchers, for their part, ask for simplicity in coding, code clarity and a highly tunable code.

These requests are reflected in the activities of the aerodynamics group. The elsA software comes with procedures that enhance productivity in a multi-user, multi-platform environment: a validation database, unit test cases, CVS management tools, a software quality program, documentation and training.

Common tasks include developing new features, re-engineering designs, improving the verification and validation databases, contributing to debugging and quality reviews, and writing the user, developer and theory manuals.

Portability tests, optimisation and benchmarking are also frequent activities. They ensure the reliability and efficiency of the code and enable smooth transitions whenever industrial partners renew their computing facilities.

Finally, researchers at CERFACS can take advantage of the industrial environment delivered by Airbus and installed by the team members on CERFACS computers. That enables a real synergy between the two partners.



Figure 3.16: Speedup of LussorSca5 on 4 processors of the Compaq Alpha.

Figure 3.17: CPU time of the function LussorSca before and after optimisation.



3.3.3  Parallelism with MPI (M. Montagnac, J.-F. Boussuge, S. Champagneux)

Since 2002, CERFACS has been involved in an ONERA project called ParelsA, which aims to parallelise elsA. Unit test cases were developed to check the implementation of the MPI message-passing library and thereby ensure the portability of the code. A re-engineering of some aspects of the design has been proposed to reduce the number of synchronous communications and the size of messages. The following numbers give a first indication of the results of that work: in a four-processor calculation with 2 domains per processor, the speedup went from 2.86 with the initial version of the code to 3.55 with an enhanced version. Even better results are expected in the near future. Asynchronous communications are also under investigation.
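
To put these two speedups in perspective, they can be converted into parallel efficiencies (speedup divided by processor count) and into a relative reduction in run time. The short C sketch below does this arithmetic. The helper names are hypothetical, and the run-time comparison assumes that both speedups are measured against the same serial time, which the report does not state.

```c
#include <assert.h>

/* Parallel efficiency: measured speedup divided by the processor count. */
double parallel_efficiency(double speedup, int nprocs)
{
    assert(speedup > 0.0 && nprocs > 0);
    return speedup / (double)nprocs;
}

/* Relative reduction in parallel run time when the speedup improves
 * from s_old to s_new. Assumption: both speedups share the same serial
 * time T1, so T_old = T1 / s_old and T_new = T1 / s_new, which gives
 * 1 - T_new / T_old = 1 - s_old / s_new. */
double time_reduction(double s_old, double s_new)
{
    assert(s_old > 0.0 && s_new > 0.0);
    return 1.0 - s_old / s_new;
}
```

Calling parallel_efficiency(2.86, 4) and parallel_efficiency(3.55, 4) gives about 0.72 and 0.89, so on four processors the efficiency rises from roughly 72% to 89%. Under the stated assumption, time_reduction(2.86, 3.55) is about 0.19, that is, a parallel run roughly 19% shorter.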

These high-performance computing activities are carried out on the CERFACS computing facilities and also on supercomputers at the French national center CINES.

3.3.4  Code performance (M. Montagnac, J.-F. Boussuge, S. Champagneux)

As part of CERFACS's continuous effort to optimise the performance of its software, specific actions have been conducted. For instance, CERFACS participated in a benchmark of several European multi-block codes (NSMB, FLOWer, RANS-MB, etc.) during autumn and winter 2002, in the framework of a rationalisation process conducted within Airbus. CERFACS contributed to an average speed-up of three for elsA over a relevant range of industrial configurations on vector architectures such as the NEC SX series and the Fujitsu VPP series. This finally led to the selection of elsA as the single multi-block (MB) structured simulation tool across all Airbus sites in Europe. CERFACS is now focusing on SMP architectures.
Most of the latest high-performance computers have a superscalar architecture. As elsA was initially optimised for vector computers, a significant performance gain can be achieved by optimising the code for scalar machines without changing the numerical behaviour.
In collaboration with the "Parallel Algorithms" team and SMP vendors such as IBM, several optimisation strategies have been examined in the context of elsA: changing the array structure, blocking, prefetch stream optimisation, and changing the loop order. The appearance of significant CPU-time peaks has been explained, and methods to avoid them have been found. Fig. 3.17 shows the impact of the optimisation on a very CPU-time-consuming function, LussorSca.
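
The report does not detail the transformations applied to LussorSca. As a generic illustration of two of the strategies listed above, the C sketch below shows a change of loop order (making the inner loop run over contiguous memory) and blocking (working on small tiles that fit in cache). The function names and the tile size TILE are assumptions chosen for the example. Note that C stores arrays row by row, whereas Fortran stores them column by column, so the favourable loop order is reversed between the two languages.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 32 /* hypothetical tile edge; the best value is machine dependent */

/* Strided order: the inner loop walks down a column of a row-major
 * array, so consecutive accesses are ncols doubles apart in memory. */
void scale_strided(double *a, size_t nrows, size_t ncols, double factor)
{
    assert(a != NULL);
    for (size_t j = 0; j < ncols; ++j)
        for (size_t i = 0; i < nrows; ++i)
            a[i * ncols + j] *= factor;
}

/* Interchanged order: the inner loop now touches contiguous memory.
 * The result is identical; only the traversal order changes. */
void scale_contiguous(double *a, size_t nrows, size_t ncols, double factor)
{
    assert(a != NULL);
    for (size_t i = 0; i < nrows; ++i)
        for (size_t j = 0; j < ncols; ++j)
            a[i * ncols + j] *= factor;
}

/* Blocking: transpose an n x m row-major matrix tile by tile, so that
 * both the rows read from src and the rows written to dst stay in
 * cache while a tile is processed. dst is m x n, row-major. */
void transpose_blocked(const double *src, double *dst, size_t n, size_t m)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t ii = 0; ii < n; ii += TILE) {
        for (size_t jj = 0; jj < m; jj += TILE) {
            size_t i_end = (ii + TILE < n) ? ii + TILE : n;
            size_t j_end = (jj + TILE < m) ? jj + TILE : m;
            for (size_t i = ii; i < i_end; ++i)
                for (size_t j = jj; j < j_end; ++j)
                    dst[j * n + i] = src[i * m + j];
        }
    }
}
```

Both transformations leave the computed values unchanged, which is consistent with the requirement stated above that the optimisation must not alter the numerical behaviour.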


Last Updated: Apr 11, 2008
Copyright  © 2002-2007 CERFACS   
All rights reserved.