Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems
Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, no. 9 (2011), pp. 76-86

See the article record from the source Math-Net.Ru

The paper is devoted to an extended parallelization for numerical experiments in fluid dynamics and aeroacoustics on heterogeneous systems that combine computing units of different architectures, namely CPUs and GPUs. A hybrid two-level MPI+OpenMP parallel model is extended with OpenCL in order to engage GPUs, which adds a third level of parallelism. A model of an algorithm for unstructured meshes is presented.
Keywords: CFD, computational aeroacoustics, parallel computing, MPI, OpenMP, GPU, OpenCL.
A. V. Gorobets; S. A. Soukov; A. O. Zheleznyakov; P. B. Bogdanov; B. N. Chetverushkin. Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems. Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, no. 9 (2011), pp. 76-86. http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/
@article{VYURU_2011_9_a7,
     author = {A. V. Gorobets and S. A. Soukov and A. O. Zheleznyakov and P. B. Bogdanov and B. N. Chetverushkin},
     title = {Extension with {OpenCL} of the two-level {MPI+OpenMP} parallelization for {CFD} simulations on heterogeneous systems},
     journal = {Vestnik \^U\v{z}no-Uralʹskogo gosudarstvennogo universiteta. Seri\^a, Matemati\v{c}eskoe modelirovanie i programmirovanie},
     pages = {76--86},
     year = {2011},
     number = {9},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/}
}
TY  - JOUR
AU  - A. V. Gorobets
AU  - S. A. Soukov
AU  - A. O. Zheleznyakov
AU  - P. B. Bogdanov
AU  - B. N. Chetverushkin
TI  - Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems
JO  - Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
PY  - 2011
SP  - 76
EP  - 86
IS  - 9
UR  - http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/
LA  - ru
ID  - VYURU_2011_9_a7
ER  - 
%0 Journal Article
%A A. V. Gorobets
%A S. A. Soukov
%A A. O. Zheleznyakov
%A P. B. Bogdanov
%A B. N. Chetverushkin
%T Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems
%J Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
%D 2011
%P 76-86
%N 9
%U http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/
%G ru
%F VYURU_2011_9_a7

[1] R. Aubry, G. Houzeaux, M. Vazquez, J. M. Cela, “Some useful strategies for unstructured edge-based solvers on shared memory machines”, International J. for Numerical Methods in Engineering, 85 (2010), 537–561 | DOI

[2] K. Itakura, A. Uno, M. Yokokawa, T. Ishihara, Y. Kaneda, “Scalability of hybrid programming for a CFD code on the Earth Simulator”, Parallel Computing, 30:12 (2004), 1329–1343 | DOI

[3] K. Nakajima, “Three-level hybrid vs. flat MPI on the Earth Simulator: Parallel iterative solvers for finite-element method”, Applied Numerical Mathematics, 54:2 (2005), 237–255 | DOI | Zbl

[4] V. Heuveline, M. J. Krause, J. Latt, “Towards a hybrid parallelization of lattice Boltzmann methods”, Computers and Mathematics with Applications, 58:5 (2009), 1071–1080 | DOI | MR | Zbl

[5] M. J. Chorley, D. W. Walker, “Performance analysis of a hybrid MPI/OpenMP application on multi-core clusters”, J. of Computational Science, 1:3 (2010), 168–174 | DOI

[6] A. Monakov, A. Lokhmotov, A. Avetisyan, “Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures”, High Performance Embedded Architectures and Compilers, Lecture Notes in Computer Science, 5952, 2010, 111–125 | DOI

[7] L. Buatois, G. Caumon, B. Levy, “Concurrent number cruncher: a GPU implementation of a general sparse linear solver”, Int. J. Parallel Emerg. Distrib. Syst., 24:3 (2009), 205–223 | DOI | MR

[8] I. Abalakin, A. Dervieux, T. Kozubskaya, Computational Study of Mathematical Models for Noise DNS, AIAA 2002-2585

[9] I. Abalakin et al., Accuracy Improvement for Finite-Volume Vertex-Centered Schemes Solving Aeroacoustics Problems on Unstructured Meshes, AIAA 2010-3933

[10] The OpenCL Specification, Version 1.1, Khronos OpenCL Working Group, 2010 (accessed: 11.06.2011) http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf

[11] AMD Accelerated Parallel Processing OpenCL Programming Guide, Advanced Micro Devices, Inc., 2011 (accessed: 11.06.2011) http://developer.amd.com/sdks/AMDAPPSDK/assets/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide.pdf

[12] OpenCL Programming Guide for the CUDA Architecture, Version 2.3, NVIDIA (accessed: 11.06.2011) http://developer.download.nvidia.com/compute/cuda/3_0/toolkit/docs/NVIDIA_OpenCL_ProgrammingGuide.pdf

[13] A Unified Runtime System for Heterogeneous Multicore Architectures, INRIA RUNTIME team, 2010 (accessed: 11.06.2011) http://runtime.bordeaux.inria.fr/StarPU/

[14] E. Agullo et al., “Faster, Cheaper, Better — a Hybridization Methodology to Develop Linear Algebra Software for GPUs”, GPU Computing Gems, 2, Morgan Kaufmann, 2010, INRIA-00547847:1