Efficiency of classical molecular dynamics algorithms on supercomputing hardware
Matematičeskoe modelirovanie, Volume 28 (2016) no. 5, pp. 95-108.

See the article record from the source Math-Net.Ru

The development of new HPC architectures proceeds faster than the corresponding adaptation of algorithms for such fundamental mathematical models as classical molecular dynamics. The wide variety of hardware choices calls for clear guiding criteria for the computational efficiency of a particular model on particular hardware. The LINPACK benchmark can no longer serve this role. In this work we consider a practical metric: time-to-solution versus computational peak performance. Using this metric, we compare different hardware (both legacy and modern) on the example of the LAMMPS software package, which is widely used for atomistic modeling. We show that the metric considered can serve as a universal, unambiguous scale for ranking different combinations of CPUs, accelerators and interconnects.
Keywords: atomistic modeling, accelerators, peak performance, CPU architecture.
@article{MM_2016_28_5_a6,
     author = {Grigory S. Smirnov and Vladimir V. Stegailov},
     title = {Efficiency of classical molecular dynamics algorithms on supercomputing hardware},
     journal = {Matemati\v{c}eskoe modelirovanie},
     pages = {95--108},
     publisher = {mathdoc},
     volume = {28},
     number = {5},
     year = {2016},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MM_2016_28_5_a6/}
}
Grigory S. Smirnov; Vladimir V. Stegailov. Efficiency of classical molecular dynamics algorithms on supercomputing hardware. Matematičeskoe modelirovanie, Volume 28 (2016) no. 5, pp. 95-108. http://geodesic.mathdoc.fr/item/MM_2016_28_5_a6/
