2.5 Software engineering
Developing and using large LES and DNS codes requires specific efforts: the codes themselves, but also the associated pre- and post-processing tools required to prepare LES runs and examine results, make software engineering a critical task at CERFACS. Multiple actions took place in the last two years.
2.5.1 Source management (Y. Sommerer)
AVBP is developed jointly by CERFACS and Institut Francais du Petrole (IFP) and is used by many European institutions. Several laboratories also contribute specific sub-models to the code. CERFACS manages the source with CVS to handle this multi-site, multi-developer environment. A software quality practice is in place to minimize development risks. Regular meetings between IFP and CERFACS define how successive versions evolve. Two levels of non-regression test cases are used:
- CTEA (Automatic Elementary Test Cases), run on a weekly basis during development,
- QPF (Quality Program Form), performed for each new version (every nine months).
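As an illustration of what a non-regression check can look like, the following minimal Python sketch compares a few monitored quantities from a new run against stored reference values. It is not the CTEA or QPF procedure: the function name, the quantity names, the values and the tolerance are all hypothetical.

```python
import math

def non_regression_check(reference, candidate, rel_tol=1e-10):
    """Return the names of monitored quantities that differ from the reference.

    `reference` and `candidate` map a quantity name to a float.
    The relative tolerance is an assumed value, not one used for AVBP.
    """
    failures = []
    for name, ref_value in reference.items():
        new_value = candidate.get(name)
        # A missing quantity or a value outside the tolerance is a regression.
        if new_value is None or not math.isclose(ref_value, new_value, rel_tol=rel_tol):
            failures.append(name)
    return failures

# Hypothetical reference and candidate results for one elementary test case.
reference = {"mean_pressure": 101325.0, "max_temperature": 2150.0}
candidate = {"mean_pressure": 101325.0, "max_temperature": 2150.5}

failed = non_regression_check(reference, candidate)
print("regressions:", failed)
```

A weekly run would loop such a check over all elementary cases and report any non-empty list of failures.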
Another specific benchmark is run frequently on various computer architectures to verify both single-processor efficiency and parallel performance.
The AVBP documentation (User's Manual and Handbook) evolves in parallel with the source code and is available to users via a web site maintained by CERFACS. This is a significant task for CERFACS: CTEA, QPFs and information management require more than 5 man-years from the CFD team every year.
2.5.2 Source optimisation (Y. Sommerer, M. García, G. Staffelbach)
The previous sections have shown an increase in the problem size and complexity of most LES at CERFACS. Simultaneously, the new generation of supercomputers opens the path to studies that were not possible until now, such as ignition sequences or computations of complete annular chambers. The recent evolution of massively parallel clusters (up to thousands of processors) reinforces the importance of sustaining high performance levels on thousands of processors:
- CPU and message passing: optimization, using the most modern profiling tools, is done jointly with the CERFACS Parallel algorithm team to use the most efficient MPI functions and minimize CPU cost. This work is key to reaching linear speedups up to 5000 processors.
- Domain decomposition: the algorithm used for domain decomposition controls the speedup for massively parallel cases. AVBP is linked with the most modern domain decomposition algorithms in order to minimize the interface between neighboring domains (i.e. minimize the point-to-point communications).
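The effect of the decomposition on the interface size can be illustrated on a toy case. The minimal Python sketch below counts the cell-to-cell connections cut by two different four-way partitions of a small structured grid. The grid, the two partitioning functions and all names are assumptions made for illustration; AVBP's meshes and its actual decomposition algorithms are not represented here.

```python
def grid_edges(nx, ny):
    """Yield the connections between neighbouring cells of an nx-by-ny grid."""
    for i in range(nx):
        for j in range(ny):
            if i + 1 < nx:
                yield (i, j), (i + 1, j)
            if j + 1 < ny:
                yield (i, j), (i, j + 1)

def edge_cut(nx, ny, owner):
    """Count connections whose two cells belong to different subdomains.

    Each cut connection stands for data that neighbouring processors
    must exchange through point-to-point messages.
    """
    return sum(1 for a, b in grid_edges(nx, ny) if owner(a) != owner(b))

nx, ny, nparts = 16, 16, 4

def strips(cell):
    # Four vertical strips, each 4 cells wide.
    return cell[0] // (nx // nparts)

def blocks(cell):
    # Four square blocks arranged 2 x 2.
    return (cell[0] // (nx // 2), cell[1] // (ny // 2))

cut_strips = edge_cut(nx, ny, strips)
cut_blocks = edge_cut(nx, ny, blocks)
print("strips:", cut_strips, "blocks:", cut_blocks)
```

Both partitions give each subdomain the same number of cells, but the square blocks cut fewer connections than the strips, so less data crosses subdomain boundaries.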
2.5.3 Frontier computations (Y. Sommerer, G. Staffelbach)
CERFACS collaborations with computer companies and computing centers make it possible to extend LES to 'frontier' simulations, thereby testing these supercomputers and anticipating the problems linked to very large computations in terms of memory, parallelism and pre/post-processing. Two application fields were used in 2005 for these tests:
- Piston engines
- Full gas turbines (aeronautical and industrial)
Figure 2.22: LES of ignition in a helicopter gas turbine using jets of hot gases. Computation with AVBP on 2048 processors (BlueGene configuration).
High-resolution LES of turbulent flows in Diesel intake pipes (10 million cells; 1024 processors) were performed on an IBM eServer Blue Gene made available to CERFACS by IBM. Test case data were provided by PSA. In DI engines, aerodynamics plays a key role: the design of intake pipes is crucial and requires significant optimization, especially with CFD. For such flows, classical turbulence methods lack accuracy. The revolution introduced by Large Eddy Simulation (LES) methods in the last ten years now allows a precise computation of these flows, but the size of the models makes them impossible to run on most computers with classical architectures.
In 2005, AVBP was ported to a BlueGene machine and LES was run on a high-resolution mesh to compute a typical Diesel intake geometry (see Fig. 2.20).
Instantaneous velocity fields exhibit many structures in the valve jets and show that the high-resolution LES reveals flow features that had never been computed before.
In the field of aeronautical gas turbines, a full combustion chamber was computed with LES on BlueGene and CINES computers (20-million-cell cases; 2048 processors on IBM BG/L and 32 to 128 processors on the CINES SGI O3800, in a joint CINES/CERFACS project).
Computing combustion in a full combustion chamber had long been out of reach of Computational Fluid Dynamics tools. In 2005, AVBP was used on BlueGene with high-resolution meshes to compute ignition and flame propagation in the combustor of a helicopter turboshaft engine from Turbomeca (Safran group). In this geometry (Fig. 2.22), all fuel injectors (18) and dilution jets (108) are included and a full ignition sequence starting from two igniters is computed.
Industrial gas turbines (40 million cells; 1000 to 5120 processors on IBM BG/L) were also computed as test cases for massively parallel machines. For such turbines, recent CERFACS LES studies show the importance of burner-burner interaction and azimuthal acoustic modes for accurately predicting flame stability. This requires full-chamber computations with all burners (usually 24) and huge CPU capacities. A Siemens PG configuration was run on IBM BlueGene/L machines (Thomas Watson Research Center and Rochester, respectively 2nd and 22nd in the 26th TOP500 list) and linear speedups up to 5000 processors were measured (Fig. 2.23). Typically, a speedup of 4078 on 4096 processors was obtained.
Figure 2.23: Speed-ups obtained with AVBP on BlueGene (Thomas Watson Research Center)
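The quoted speedup can be turned into a parallel efficiency, defined as the speedup divided by the number of processors. The short Python sketch below does this for the figures reported above; the function name is illustrative only.

```python
def parallel_efficiency(speedup, processors):
    """Parallel efficiency: measured speedup divided by the ideal (linear) one."""
    return speedup / processors

# Figures reported in the text: a speedup of 4078 on 4096 processors.
eff = parallel_efficiency(4078, 4096)
print(f"Parallel efficiency: {eff:.1%}")  # prints "Parallel efficiency: 99.6%"
```

An efficiency this close to 100% is what the text refers to as linear speedup.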
2.5.4 Collaboration with French national computing centers (CINES and CEA) and computer companies (Y. Sommerer)
CERFACS continues to collaborate with the French national computing centers CINES, IDRIS and CEA. Because of their excellent parallel scalability, AVBP and/or NTMIX are often used by these three institutions to benchmark machines and stress the whole machine configuration.
In parallel, a joint effort with manufacturers such as IBM [1] and Cray was made in 2005 to optimize AVBP on specific processors and interconnection networks: PowerPC 440 (IBM BlueGene/L) and Opteron (Cray XD1).
[1] IBM Redbook (Chapter 8.4): http://www.redbooks.ibm.com/abstracts/sg246686.html?Open