Guillaume Puigt

Senior researcher, PhD in Applied Mathematics, Accreditation to supervise research

CERFACS
CFD Team
42, avenue Gaspard Coriolis
31057 Toulouse Cedex, France

Mail: guillaume.puigt(at)cerfacs.fr
Phone: +33 5 61 19 30 94
Fax: +33 5 61 19 30 30

JAGUAR is a new CFD code currently under development at CERFACS. It is designed for the analysis of high-order schemes (convection and diffusion) on high-order meshes. The solver is described in the following sections.

Introduction


Developing accurate schemes depends strongly on the mesh chosen for the computations. While classical approaches (compact implicit, Finite Difference or WENO schemes, for instance) are easy to implement in a structured framework, the situation is much more complex on unstructured grids. In particular, any approach based on a large stencil must be avoided, for several reasons. First, building large stencils requires complex geometric algorithms that are difficult to implement and validate, and specific situations may call for dedicated corrections; this typically happens when the mesh is composed of elements of different shapes. Moreover, large stencils and High Performance Computing (HPC) have opposite objectives: many fields must be exchanged when a large stencil is used, so messages become long. Sequential efficiency is also affected, since badly arranged data in memory lead to cache misses and a high CPU time per degree of freedom. Finally, the performance of such codes is driven more by the memory bandwidth and MPI efficiency of the supercomputer than by the computational kernel itself. Unstructured grids are nevertheless unavoidable for computations over complex CAD geometries, and much work on high-order schemes for unstructured grids has been published in recent years.

A way to avoid large stencils on unstructured grids is to increase the number of degrees of freedom inside each element: quantities are represented inside any mesh element by a high-order polynomial approximation. The reconstructed variables at a face then depend only on the element they come from, and the two extrapolations (one from each side) generally lead to a discontinuous state at mesh faces. Among the methods proposed in the literature, we have identified three techniques: the Discontinuous Galerkin (DG) method, the Spectral Volume (SV) method and the Spectral Difference (SD) method.

All three techniques define a high-order, continuous solution inside each mesh element, and since the reconstruction leads to two different states at each mesh interface, a Riemann solver is needed to compute the flux exchanged between cells. The interest of these methods also comes from the possibility to act on both the mesh refinement parameter h and the polynomial degree p. Compared with the classical Finite Volume technique, the main difference lies in the non-universal relation between a mesh element and its number of degrees of freedom (one mesh element is no longer associated with one degree of freedom, as it is in Finite Volume). This point has a very strong impact on the high-order unstructured data structure.
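To fix ideas, the following minimal 1D sketch (an illustration, not the JAGUAR implementation) stores a degree-p Lagrange polynomial in each of two neighbouring elements, extrapolates both polynomials to the shared face and combines the two discontinuous states with a simple Rusanov (local Lax-Friedrichs) flux. The node placement, the Burgers flux and all numerical values are assumptions made for the example only.

```python
import numpy as np

def lagrange_eval(nodes, values, x):
    """Evaluate the Lagrange interpolant defined by (nodes, values) at x."""
    result = 0.0
    for i, xi in enumerate(nodes):
        li = 1.0
        for j, xj in enumerate(nodes):
            if j != i:
                li *= (x - xj) / (xi - xj)
        result += values[i] * li
    return result

def rusanov_flux(u_left, u_right, f, dfdu):
    """Rusanov (local Lax-Friedrichs) flux for two extrapolated face states."""
    smax = max(abs(dfdu(u_left)), abs(dfdu(u_right)))
    return 0.5 * (f(u_left) + f(u_right)) - 0.5 * smax * (u_right - u_left)

# Example flux: Burgers equation f(u) = u^2/2 (any hyperbolic flux would do).
f    = lambda u: 0.5 * u**2
dfdu = lambda u: u

# Two neighbouring 1D elements [0, 1] and [1, 2], polynomial degree p = 3.
p = 3
ref = 0.5 * (1.0 + np.cos(np.pi * (2.0 * np.arange(p + 1) + 1.0) / (2.0 * (p + 1))))
x_left, x_right = np.sort(ref), 1.0 + np.sort(ref)   # solution points of each element
u_left_nodes  = np.sin(np.pi * x_left)               # element-wise nodal values
u_right_nodes = np.sin(np.pi * x_right) + 0.05       # slight mismatch across the face

# Each element extrapolates its own polynomial to the shared face x = 1:
# the two values differ, hence the need for a Riemann solver.
uL = lagrange_eval(x_left,  u_left_nodes,  1.0)
uR = lagrange_eval(x_right, u_right_nodes, 1.0)
print(uL, uR, rusanov_flux(uL, uR, f, dfdu))
```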

JAGUAR


Within the AAM team, H. Deniau, J-F. Boussuge and I have initiated the development of a new CFD code called JAGUAR (proJect of an Aerodynamic solver using General Unstructured grids And high ordeR schemes). We have decided to focus our attention on the Spectral Difference method. This choice is motivated by several reasons, the SD method having been designed to correct some drawbacks of DG and SV. First, it seems more efficient in terms of CPU usage (fewer computations per degree of freedom) than the DG technique. Moreover, SV suffers from a high sensitivity to the element decomposition, a drawback that the SD method avoids. Finally, the SD approach is less mature, so the potential for research work is greater.
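The global algorithm of Fig. 1 can be summarised, for the 1D linear advection equation u_t + a u_x = 0 on a periodic mesh, by the following sketch of one residual evaluation. The Chebyshev-type solution and flux points and the simple upwind interface flux are illustrative assumptions, not the points and Riemann solver actually used in JAGUAR.

```python
import numpy as np

def lagrange_matrix(src, dst):
    """M[j, i] = l_i(dst[j]) for the Lagrange basis l_i built on the nodes src."""
    M = np.ones((len(dst), len(src)))
    for i, xi in enumerate(src):
        for j, xj in enumerate(src):
            if j != i:
                M[:, i] *= (dst - xj) / (xi - xj)
    return M

def lagrange_deriv_matrix(src, dst):
    """D[j, i] = l_i'(dst[j]) for the Lagrange basis l_i built on the nodes src."""
    n = len(src)
    D = np.zeros((len(dst), n))
    for i in range(n):
        for m in range(n):
            if m == i:
                continue
            term = np.ones(len(dst)) / (src[i] - src[m])
            for j in range(n):
                if j != i and j != m:
                    term *= (dst - src[j]) / (src[i] - src[j])
            D[:, i] += term
    return D

def sd_residual(u, a, dx, interp, deriv):
    """du/dt = -(2/dx) dF/dxi for a nodal solution u of shape (n_elem, p+1)."""
    uf = u @ interp.T                       # 1) extrapolate the solution to the flux points
    F  = a * uf                             # 2) evaluate the flux at the flux points
    # 3) replace the face values by a common upwind flux (a > 0: the left side wins)
    F[:, 0]  = a * np.roll(uf[:, -1], 1)    #    left face: value from the left neighbour
    F[:, -1] = a * uf[:, -1]                #    right face: this element is the upwind side
    return -(2.0 / dx) * (F @ deriv.T)      # 4) differentiate the flux polynomial

# Degree-p solution: p+1 solution points and p+2 flux points on the reference element [-1, 1].
p  = 3
xs = np.sort(np.cos(np.pi * (2.0 * np.arange(p + 1) + 1.0) / (2.0 * (p + 1))))
xf = np.sort(np.concatenate(([-1.0, 1.0], np.cos(np.pi * (2.0 * np.arange(p) + 1.0) / (2.0 * p)))))
interp = lagrange_matrix(xs, xf)            # (p+2, p+1): solution points -> flux points
deriv  = lagrange_deriv_matrix(xf, xs)      # (p+1, p+2): flux-polynomial derivative at solution points

# Periodic mesh of 20 elements on [0, 1], initialised with a sine wave.
n_elem, a = 20, 1.0
dx = 1.0 / n_elem
x  = ((np.arange(n_elem) + 0.5) * dx)[:, None] + 0.5 * dx * xs[None, :]
dudt = sd_residual(np.sin(2.0 * np.pi * x), a, dx, interp, deriv)
```

An explicit time integrator (a Runge-Kutta scheme, for instance) then advances this residual in time.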

Figure 1: Global algorithm for the SD method
Figure 2: Strong scalability on Airain depending on polynomial degree

The JAGUAR structure has been, and is still being, optimized in order to obtain the best computing efficiency on both classical and GPGPU architectures. JAGUAR is written in Fortran 90, and parallel computations can be performed with an MPI implementation using asynchronous communications. For massively parallel computations, an OpenMP paradigm is also implemented in order to limit the number of messages: a hybrid approach is used, with MPI communications between nodes and OpenMP threading inside each node. The GPGPU implementation is based on the CUDA Fortran compiler, and it is planned to extend this support soon to Intel MIC architectures. On Airain, we have shown very good strong scalability, with almost perfectly linear scaling up to 2048 cores. On Neptune, a strong scalability study was conducted on a very small mesh in order to determine the minimum amount of data required per core: Fig. 3 shows that 80% efficiency is reached with only 400 degrees of freedom per computational core. Finally, JAGUAR has been adapted to GPUs. Fig. 4 shows the scalability curve for 1 to 64 GPUs and for two meshes of 23 million degrees of freedom each, computed on the Curie supercomputer. The efficiency is 47.0/64 (about 73%) in 3D and 49.9/64 (about 78%) in 2D, which is the same level of efficiency as published in the literature.
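For reference, strong-scaling efficiency is simply the measured speed-up divided by the increase in resources; the tiny helper below (with made-up timings, not actual JAGUAR measurements) makes the definition explicit.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Strong scaling: same problem size, more resources.
    Efficiency = (speed-up w.r.t. the reference run) / (resource increase)."""
    speedup = t_ref / t_n
    return speedup * n_ref / n

# Illustrative timings only: a run that is 47 times faster on 64 GPUs than on 1 GPU
# corresponds to the 3D figure quoted above (47.0/64, i.e. about 73% efficiency).
print(strong_scaling_efficiency(t_ref=470.0, n_ref=1, t_n=10.0, n=64))  # -> 0.734...
```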

Figure 3: Global scalability with a very small mesh
Figure 4: Strong scalability on a cluster of Tesla M2090 cards, each with 512 GPU cores

Recent results


The first application of JAGUAR is the COnvection of a VOrtex (COVO) test case, a basic test case for high-order schemes dedicated to LES and aeroacoustics. The goal is to transport a vortex in a periodic box (of length 0.1 m), the vortex crossing the domain 10, 20 or more times. Here, we consider a homogeneous flow on which the vortex is superimposed. The vortex is centred at (Xc, Yc) and R denotes its characteristic radius. The numerical parameters needed to define the initialization are given in Fig. 5.

Figure 5: Numerical parameters for the COVO case
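Since Fig. 5 only lists the parameters, the sketch below reproduces one common form of the COVO initial condition (a Gaussian vortex of strength Beta and radius R superimposed on a uniform flow), using the values quoted later in the text (Mach 0.05, Beta = 1/50, R = 0.005 m, box of length 0.1 m). The free-stream state and the exact analytical expressions are assumptions for illustration and may differ from the initialization actually used in JAGUAR.

```python
import numpy as np

# Parameters quoted in the text for the slow-vortex case; the free-stream state is assumed.
gamma, r_gas = 1.4, 287.15
p_inf, T_inf = 101325.0, 300.0
mach, beta, R = 0.05, 1.0 / 50.0, 0.005
L, xc, yc = 0.1, 0.05, 0.05
cp = gamma * r_gas / (gamma - 1.0)
u_inf = mach * np.sqrt(gamma * r_gas * T_inf)
rho_inf = p_inf / (r_gas * T_inf)

def covo_init(x, y):
    """Gaussian vortex superimposed on a uniform flow (one common COVO form)."""
    r2 = ((x - xc)**2 + (y - yc)**2) / R**2
    u = u_inf * (1.0 - beta * (y - yc) / R * np.exp(-0.5 * r2))
    v = u_inf * beta * (x - xc) / R * np.exp(-0.5 * r2)
    T = T_inf - (u_inf * beta)**2 / (2.0 * cp) * np.exp(-r2)
    rho = rho_inf * (T / T_inf)**(1.0 / (gamma - 1.0))
    return rho, u, v, rho * r_gas * T       # density, velocity components, pressure

# Sample the initial field on a coarse grid covering the periodic box.
x, y = np.meshgrid(np.linspace(0.0, L, 65), np.linspace(0.0, L, 65))
rho, u, v, p = covo_init(x, y)
```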


Figure 6: Comparison of the numerical solution with the initialization for different polynomial orders, after 200 periods.


The flow is analysed after 200 periods. As seen in Fig. 6, the numerical results are very close to the initial solution for a constant number of degrees of freedom (about 25,600 DoF). The third-order accurate scheme (p=2) cannot preserve the vortex after 200 rotations, whereas the fifth-order scheme does. Fig. 7 and Fig. 8 then show the COVO after 50 rotations on a regular grid and on an irregular one (both with 25,600 DoF and p=4, i.e. a fifth-order accurate scheme): the initialization is well preserved. Note that Fig. 7 and Fig. 8 have been obtained with GMSH and that the true polynomials are represented. For high-order methods, post-processing is a bottleneck, since few tools are able to handle a high-order polynomial data representation; classical visualization assumes a linear behaviour between the degrees of freedom. To conclude, Fig. 9 compares the solutions obtained after 50 periods on the regular and on the irregular grid.
Finally, Fig. 10 shows a new computation of the COVO transported at Mach 0.05, with Beta = 1/50 and R = 0.005 m. It simply illustrates that at low order of accuracy (here p=2, i.e. third order) the solution is clearly discontinuous across element interfaces.
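Because standard viewers assume a linear variation between degrees of freedom, a common workaround, sketched below for a single 1D element (an illustration rather than the GMSH-based workflow used for Fig. 7 and Fig. 8), is to resample each element's polynomial on a finer set of points before export.

```python
import numpy as np

def lagrange_matrix(src, dst):
    """M[j, i] = l_i(dst[j]): Lagrange interpolation from the nodes src to the points dst."""
    M = np.ones((len(dst), len(src)))
    for i, xi in enumerate(src):
        for j, xj in enumerate(src):
            if j != i:
                M[:, i] *= (dst - xj) / (xi - xj)
    return M

# One element carrying a degree-4 solution (p = 4, fifth-order scheme) at 5 solution points.
p  = 4
xs = np.sort(np.cos(np.pi * (2.0 * np.arange(p + 1) + 1.0) / (2.0 * (p + 1))))  # points in (-1, 1)
u  = np.exp(-xs)                                                                # example nodal values

# Resample the polynomial on 20 equally spaced points before writing the visualization file:
# a linear-interpolation viewer then displays a faithful picture of the high-order solution.
x_vis = np.linspace(-1.0, 1.0, 20)
u_vis = lagrange_matrix(xs, x_vis) @ u
```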

Figure 7: Pressure variations after 50 periods on a regular Cartesian mesh (treated as unstructured).
Figure 8: Pressure variations after 50 periods on a fully unstructured grid composed of hexahedra.


Figure 9: x-velocity component compared with the exact solution after 50 rotations. Reg. refers to the regular mesh and Irreg. to the irregular mesh, as shown in Fig. 7 and Fig. 8.


Figure 10: Slow vortex at Mach=0.05 after 50 rotations. With p=2, the solution is discontinuous and the discontinuity decreases as the polynomial order increases.


List of contributors and reports