Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems
Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, no. 9 (2011), pp. 76-86
This article was harvested from the Math-Net.Ru source.

The paper presents an extended parallelization for numerical experiments in fluid dynamics and aeroacoustics on heterogeneous systems that combine computing units of different architectures, namely CPUs and GPUs. A hybrid two-level MPI+OpenMP parallel model is extended with OpenCL in order to engage GPUs, which adds a third level of parallelism. A model of an algorithm for unstructured meshes is presented.
Keywords: CFD, computational aeroacoustics, parallel computing, MPI, OpenMP, GPU, OpenCL.
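
The three levels named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a static split in which mesh cells are first divided among MPI processes, each process then hands a fixed fraction of its cells to a GPU (the OpenCL level), and the rest is divided among CPU threads (the OpenMP level). The names `split_range`, `three_level_decomposition`, and `gpu_share` are hypothetical, and contiguous index ranges stand in for the graph-based partitioning that an unstructured mesh would normally need.

```python
def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous chunks of near-equal size."""
    base, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        # The first `extra` chunks take one additional element.
        hi = lo + base + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def three_level_decomposition(n_cells, n_processes, gpu_share, n_threads):
    """Build a static work plan (illustrative assumption, not the paper's scheme).

    Level 1: cells are split among processes (the MPI level).
    Level 2: the CPU part of each process is split among threads (the OpenMP level).
    Level 3: a fraction `gpu_share` of each process's cells goes to a GPU
             (the OpenCL level).
    """
    if not 0.0 <= gpu_share <= 1.0:
        raise ValueError("gpu_share must be between 0 and 1")
    plan = []
    for rank, (lo, hi) in enumerate(split_range(0, n_cells, n_processes)):
        n_gpu = int(round((hi - lo) * gpu_share))
        plan.append({
            "rank": rank,
            "gpu": (lo, lo + n_gpu),
            "cpu_threads": split_range(lo + n_gpu, hi, n_threads),
        })
    return plan
```

Calling `three_level_decomposition(1000, 2, 0.75, 4)` returns one entry per process, each holding a GPU cell range and four CPU thread ranges. In practice the GPU fraction would have to be tuned to the relative throughput of the devices, which this sketch leaves to the caller.
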
@article{VYURU_2011_9_a7,
     author = {A. V. Gorobets and S. A. Soukov and A. O. Zheleznyakov and P. B. Bogdanov and B. N. Chetverushkin},
     title = {Extension with {OpenCL} of the two-level {MPI+OpenMP} parallelization for {CFD} simulations on heterogeneous systems},
     journal = {Vestnik \^U\v{z}no-Uralʹskogo gosudarstvennogo universiteta. Seri\^a, Matemati\v{c}eskoe modelirovanie i programmirovanie},
     pages = {76--86},
     year = {2011},
     number = {9},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/}
}
TY  - JOUR
AU  - A. V. Gorobets
AU  - S. A. Soukov
AU  - A. O. Zheleznyakov
AU  - P. B. Bogdanov
AU  - B. N. Chetverushkin
TI  - Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems
JO  - Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
PY  - 2011
SP  - 76
EP  - 86
IS  - 9
UR  - http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/
LA  - ru
ID  - VYURU_2011_9_a7
ER  - 
%0 Journal Article
%A A. V. Gorobets
%A S. A. Soukov
%A A. O. Zheleznyakov
%A P. B. Bogdanov
%A B. N. Chetverushkin
%T Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems
%J Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
%D 2011
%P 76-86
%N 9
%U http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/
%G ru
%F VYURU_2011_9_a7
A. V. Gorobets; S. A. Soukov; A. O. Zheleznyakov; P. B. Bogdanov; B. N. Chetverushkin. Extension with OpenCL of the two-level MPI+OpenMP parallelization for CFD simulations on heterogeneous systems. Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, no. 9 (2011), pp. 76-86. http://geodesic.mathdoc.fr/item/VYURU_2011_9_a7/
