Best Practice Guide 1: Preprocessing
Type
Best practice
Description
This document summarizes the lessons learnt during the technical developments of Work Package 3 of the EXCELLERAT project. It is written by the main code developers and provides recommendations and practical experience gained during development and testing towards the project objectives. It covers data transfer, mesh adaptation, performance engineering and load balancing, parallel solvers, and testing on emerging technologies. For data transfer to high-performance computing (HPC) systems, a software tool was created that allows external users to create jobs, upload their data and download their results. The jobs are inserted into the queue of an HPC machine for execution, and communication is secured using the HTTPS and TLS protocols.
Substantial effort has been dedicated to developing linear solvers that achieve high parallel efficiency. Direct solvers are usually not feasible because of their high memory requirements, so iterative solvers are commonly used. TPLS uses iterative solvers from PETSc, CODA uses the Spliss solver, and Alya has developed cache-aware sparsity patterns for the factorized sparse approximate inverse preconditioner used with the Conjugate Gradient method. Dynamic mesh adaptation has been a key feature in EXCELLERAT, and the project codes Alya, AVBP, Nek5000 and m-AIA have each pursued different strategies. Developing parallel adaptive mesh refinement (AMR) techniques proved difficult because of the complexity of achieving optimal re-meshing together with efficient load balancing. General recommendations and benchmark results from the EXCELLERAT codes are given for future comparison with other strategies. Another active area of development for the flagship codes was performance engineering, where the developers explored different load balancing strategies at node and system level. Porting to emerging technologies also brought challenges in maintaining computational performance and ensuring reproducibility of the results. These results can be considered state of the art for petascale code development towards exascale, and can be used for benchmarking and for assessing future capabilities of HPC codes for computational fluid dynamics (CFD). They show the fundamental challenges that state-of-the-art petascale codes must address to achieve high performance and efficiency on future exascale architectures. General recommendations and lessons learnt are provided for future developments.
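As a rough illustration of the preconditioned Conjugate Gradient method mentioned above, the following is a minimal serial sketch in Python. It is not taken from Alya, TPLS or CODA. The test problem (a matrix-free 1D Poisson operator), the function names and the tolerance are all assumptions made for illustration, and a simple Jacobi (diagonal) preconditioner stands in for the factorized sparse approximate inverse preconditioner. The operator is never stored or factorized, which is the memory advantage of iterative over direct solvers.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equally sized lists."""
    return sum(a * b for a, b in zip(u, v))


def poisson_matvec(v):
    """Apply the 1D Poisson matrix tridiag(-1, 2, -1) without storing it.

    Hypothetical test operator, chosen because it is symmetric positive
    definite, which the Conjugate Gradient method requires.
    """
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def jacobi_preconditioner(r):
    """Jacobi preconditioner: divide by the diagonal, which is 2 here.

    Assumption: a stand-in for a more capable preconditioner such as
    a factorized sparse approximate inverse.
    """
    return [ri / 2.0 for ri in r]


def preconditioned_cg(matvec, b, precond, tol=1e-10, max_iter=1000):
    """Solve A x = b, where A is given only through matvec(v) = A v.

    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for x = 0
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    z = precond(r)                   # preconditioned residual
    p = list(z)                      # first search direction
    rz = dot(r, z)
    for k in range(1, max_iter + 1):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)      # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        z = precond(r)
        rz_new = dot(r, z)
        beta = rz_new / rz           # makes the new direction A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


n = 50
b = [1.0] * n
x, iterations = preconditioned_cg(poisson_matvec, b, jacobi_preconditioner)

# Check the true residual rather than the recursively updated one.
true_residual = [bi - ai for bi, ai in zip(b, poisson_matvec(x))]
residual_norm = math.sqrt(dot(true_residual, true_residual))
```

In a parallel code the matrix-vector product and the inner products are the steps that require communication, so the choice of preconditioner and sparsity pattern largely determines parallel efficiency.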
License
Public