Resource efficient finite element computing on multicore architectures

S. P. Kopysov; I. R. Kadyrov; A. K. Novikov

Izvestiya Instituta Matematiki i Informatiki Udmurtskogo Gosudarstvennogo Universiteta, Volume 53 (2019), pp. 83-97. This article was harvested from the source Math-Net.Ru.

Abstract

In this paper, we consider the construction of efficient finite element algorithms on three-dimensional unstructured meshes that take into account the complexity of parallel synchronization, memory distribution, and data storage. A layer-by-layer partitioning of the meshes into subdomains without branching internal boundaries is proposed; it simplifies access to independent data and parallel computation at the different stages of solving finite element problems on unstructured meshes in multiply connected domains. We analyze how well the time efficiency and resource intensity of the proposed algorithmic solutions can be predicted. Resource efficiency is analyzed for the element-by-element scheme for assembling and solving the system of linear algebraic equations of the finite element method. It is shown that, because of their low arithmetic intensity, the performance of the algorithms considered is limited by the bandwidth of the memory subsystem rather than by processor performance. Graphics memory has a higher bandwidth than main (random-access) memory, which allows a significant increase in the performance of the algorithm on a GPU.

Keywords: finite element methods, parallel computing, mesh partitioning, Reeb graph, arithmetic intensity, universal scalability model.
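
The bandwidth argument in the abstract can be illustrated with the roofline model of Williams, Waterman and Patterson, in which attainable performance is the smaller of peak compute throughput and arithmetic intensity times memory bandwidth. The sketch below is not taken from the paper: the function names, the hardware figures, and the arithmetic intensity are all hypothetical values chosen only to show the shape of the reasoning.

```python
def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound in GFLOP/s.

    intensity      arithmetic intensity, flop per byte moved from memory
    peak_gflops    peak compute throughput of the device, GFLOP/s
    bandwidth_gbs  sustained memory bandwidth of the device, GB/s
    """
    return min(peak_gflops, intensity * bandwidth_gbs)


def is_memory_bound(intensity, peak_gflops, bandwidth_gbs):
    """True when the kernel sits left of the roofline ridge point."""
    ridge = peak_gflops / bandwidth_gbs  # flop/byte where both limits meet
    return intensity < ridge


# Hypothetical devices (illustrative numbers, not measurements from the paper).
CPU = {"peak_gflops": 500.0, "bandwidth_gbs": 50.0}
GPU = {"peak_gflops": 4000.0, "bandwidth_gbs": 400.0}

# Hypothetical low arithmetic intensity for an element-by-element kernel.
INTENSITY = 0.25  # flop/byte

for name, dev in (("CPU", CPU), ("GPU", GPU)):
    bound = attainable_gflops(INTENSITY, **dev)
    limited = "memory" if is_memory_bound(INTENSITY, **dev) else "compute"
    print(f"{name}: at most {bound} GFLOP/s ({limited}-bound)")
```

With these assumed figures both devices are memory-bound, so the attainable performance scales with memory bandwidth rather than with peak compute throughput, which is the mechanism the abstract describes for the GPU speedup.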

@article{IIMI_2019_53_a7,
     author = {S. P. Kopysov and I. R. Kadyrov and A. K. Novikov},
     title = {Resource efficient finite element computing on multicore architectures},
     journal = {Izvestiya Instituta Matematiki i Informatiki Udmurtskogo Gosudarstvennogo Universiteta},
     pages = {83--97},
     year = {2019},
     volume = {53},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/IIMI_2019_53_a7/}
}

S. P. Kopysov; I. R. Kadyrov; A. K. Novikov. Resource efficient finite element computing on multicore architectures. Izvestiya Instituta Matematiki i Informatiki Udmurtskogo Gosudarstvennogo Universiteta, Volume 53 (2019), pp. 83-97. http://geodesic.mathdoc.fr/item/IIMI_2019_53_a7/

