many-core.group

Computational Fluid Dynamics

Graham Pullan and Tobias Brandvik

Introduction

Computational Fluid Dynamics (CFD) is a broad discipline in which computers are used to calculate the flow around, or through, objects. In science, CFD is used in a diverse array of applications ranging from biology, through meteorology, to astrophysics. In engineering, CFD is applied to both external flows (around products: airplanes, cars, boats) and internal flows (through products: buildings, pipes). At the Whittle Laboratory in the Department of Engineering, CFD is used in the design and analysis of turbomachinery: jet engines, gas turbines, steam turbines, pumps and similar machines. In fact, turbomachinery was one of the first branches of engineering to make active use of CFD in design (in the 1970s), and the Whittle Laboratory has played a key role in the development of such codes since then.

The aim of this project is to use many-core accelerators to provide a step change (10x to 100x speedup on a per-cost basis) in the run-times of CFD solvers. In the turbomachinery industry, this would enable:

  • solutions at current grid resolutions on a human-interactive timescale (seconds)
  • evaluation of far more design candidates by automatic optimizers, at current grid resolutions, than is currently feasible
  • solutions on the current timescale (hours) but at much finer grid resolutions, allowing turbulent flow structures to be calculated in a routine design process for the first time.

Implementation

Figure 1: Iterating upwards in a sub-block in CUDA. One thread is used for each node in the plane.

Figure 2: A stencil using the nearest neighbours in three dimensions.

Figure 3: Mach contours and comparison to experiments for a VKI McDonald transonic turbine blade.

For this project, an existing FORTRAN code is to be ported to a range of many-core accelerators. Here, we focus on NVIDIA GPUs programmed in the high-level CUDA language. The code is a structured multi-block solver: the volume occupied by the fluid is divided into 3D hexahedral blocks, and each block comprises a structured 3D array (i,j,k) of data. The flow properties are calculated by applying the governing conservation equations (mass, momentum and energy) to each data point in the array. A typical calculation has about one million points, and the same equations are applied to each point many thousands of times, adjusting the flow properties iteratively until the final solution is found.
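To make the data layout and the iterative update concrete, here is a minimal Python sketch. It is not taken from the solver: the `Block` class, the `iterate` function and the relaxation update are hypothetical stand-ins, and the real code applies the conservation equations rather than this toy update. The sketch assumes that `i` varies fastest in memory, as it would in a Fortran array dimensioned `(ni, nj, nk)`.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One structured block: an ni x nj x nk array stored as a flat list."""
    ni: int
    nj: int
    nk: int
    data: list = field(default_factory=list)

    def __post_init__(self):
        # Allocate the storage if the caller did not supply any.
        if not self.data:
            self.data = [0.0] * (self.ni * self.nj * self.nk)

    def index(self, i, j, k):
        # Map (i, j, k) to a flat offset; i varies fastest (assumed layout).
        return i + self.ni * (j + self.nj * k)


def iterate(block, target, relax=0.5, tol=1e-10, max_steps=10_000):
    """Apply the same pointwise update repeatedly until the largest change
    in any point falls below tol. Returns the number of steps taken.

    The update (relaxation toward `target`) is a hypothetical stand-in for
    the conservation equations of a real solver.
    """
    for step in range(1, max_steps + 1):
        largest_change = 0.0
        for n in range(len(block.data)):
            change = relax * (target - block.data[n])
            block.data[n] += change
            largest_change = max(largest_change, abs(change))
        if largest_change < tol:
            return step
    return max_steps
```

For example, `iterate(Block(4, 3, 2), target=1.0)` drives all 24 points of a small block toward 1.0. A multi-block solver would hold one such array per block and exchange data at the block boundaries, which this sketch leaves out.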

The data at each point are updated using data from the surrounding points; this is a stencil operation. In CUDA, such calculations can be parallelised across the many cores of a GPU by dividing each block in the grid into smaller sub-blocks that fit in the limited memory of each core. Modern NVIDIA GPUs typically have tens of these cores, each of which has 16 KB of on-chip memory. To apply the stencil operation to each node in a sub-block, we iterate upwards through the sub-block, each time loading in a new plane of data and discarding an old one. Since a GPU core can have many light-weight threads running at the same time, one thread is started for each node in the plane. Figure 1 shows how this procedure can be used to compute a stencil that uses the points shown in Figure 2.
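The plane-by-plane sweep can be sketched on a CPU as follows. This is an illustration in Python rather than the CUDA implementation: the function names `load_plane` and `stencil_sweep` are hypothetical, the copy in `load_plane` stands in for a load into on-chip memory, and a simple average of the six nearest neighbours stands in for the real flow update. Only three planes are held at any time, and the two inner loops cover the nodes that would each get their own thread on a GPU.

```python
def load_plane(grid, k):
    """Copy plane k of grid[k][j][i]; stands in for a load into on-chip memory."""
    return [row[:] for row in grid[k]]


def stencil_sweep(grid):
    """Apply a nearest-neighbour average to the interior nodes of a sub-block,
    sweeping upwards in k while holding only three planes at a time.
    Boundary nodes are copied through unchanged."""
    nk, nj, ni = len(grid), len(grid[0]), len(grid[0][0])
    out = [[row[:] for row in plane] for plane in grid]
    if nk < 3:
        return out  # no interior planes to update

    below = load_plane(grid, 0)
    current = load_plane(grid, 1)
    for k in range(1, nk - 1):
        above = load_plane(grid, k + 1)  # load one new plane
        # On a GPU, each (i, j) node in the plane would be one thread.
        for j in range(1, nj - 1):
            for i in range(1, ni - 1):
                out[k][j][i] = (
                    current[j][i - 1] + current[j][i + 1]
                    + current[j - 1][i] + current[j + 1][i]
                    + below[j][i] + above[j][i]
                ) / 6.0
        below, current = current, above  # discard the oldest plane
    return out
```

A quick check of the sketch is to pass in a field that varies linearly in i, j and k: the average of the six neighbours then equals the centre value, so the output matches the input.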

Results

Figure 3 shows the result of a calculation of the flow around a VKI McDonald transonic turbine blade (Sieverding, 1976). On the left are contours of constant Mach number for the case of an inlet Mach number of 1.42; on the right is a comparison to experimental data for inlet Mach numbers of 1.42 and 1.06.

References

  • C. Sieverding (1976). Base Pressure in Supersonic Flow. In VKI LS: Transonic Flows in Turbomachinery.