@article{VYURV_2019_8_3_a3,
  author   = {Ezhova, N. A. and Sokolinskii, L. B.},
  title    = {Survey of parallel computation models},
  journal  = {Vestnik \^U\v{z}no-Ural{\cprime}skogo gosudarstvennogo universiteta. Seri\^a Vy\v{c}islitel{\cprime}na\^a matematika i informatika},
  pages    = {58--91},
  year     = {2019},
  volume   = {8},
  number   = {3},
  language = {ru},
  url      = {http://geodesic.mathdoc.fr/item/VYURV_2019_8_3_a3/},
}
TY - JOUR AU - N. A. Ezhova AU - L. B. Sokolinskii TI - Survey of parallel computation models JO - Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika PY - 2019 SP - 58 EP - 91 VL - 8 IS - 3 UR - http://geodesic.mathdoc.fr/item/VYURV_2019_8_3_a3/ LA - ru ID - VYURV_2019_8_3_a3 ER -
%0 Journal Article %A N. A. Ezhova %A L. B. Sokolinskii %T Survey of parallel computation models %J Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika %D 2019 %P 58-91 %V 8 %N 3 %U http://geodesic.mathdoc.fr/item/VYURV_2019_8_3_a3/ %G ru %F VYURV_2019_8_3_a3
N. A. Ezhova; L. B. Sokolinskii. Survey of parallel computation models. Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika, Tome 8 (2019) no. 3, pp. 58-91. http://geodesic.mathdoc.fr/item/VYURV_2019_8_3_a3/
[1] Y. Zhang, et al., “Models of Parallel Computation: a Survey and Classification”, Frontiers of Computer Science in China, 1:2 (2007), 156–165, Higher Education Press | DOI
[2] L. G. Valiant, “A Bridging Model for Parallel Computation”, Communications of the ACM, 33:8 (1990), 103–111 | DOI
[3] D. K. G. Campbell, A Survey of Models of Parallel Computation, Technical Report No.YCS97-278, 1997, 37 pp.
[4] J. C. Shepherdson, H. E. Sturgis, “Computability of Recursive Functions”, Journal of the ACM, 10:2 (1963), 217–255, ACM | DOI
[5] C. C. Elgot, A. Robinson, “Random-Access Stored-Program Machines, an Approach to Programming Languages”, Journal of the ACM, 11:4 (1964), 365–399, ACM | DOI
[6] J. Hartmanis, “Computational Complexity of Random Access Stored Program Machines”, Mathematical Systems Theory, 5:3 (1971), 232–245, Springer-Verlag | DOI
[7] S. A. Cook, R. A. Reckhow, “Time Bounded Random Access Machines”, Journal of Computer and System Sciences, 7:4 (1973), 354–375, Academic Press | DOI
[8] A. V. Aho, J. E. Hopcroft, J. D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, London, Amsterdam, Don Mills, Ontario, Sydney, 1974, 470 pp.
[9] D. B. Skillicorn, D. Talia, “Models and Languages for Parallel Computation”, ACM Computing Surveys, 30:2 (1998), 123–169 | DOI
[10] S. Fortune, J. Wyllie, “Parallelism in Random Access Machines”, Proceedings of the Tenth Annual ACM Symposium on Theory of Computing — STOC-78 (New York, New York, USA), ACM Press, 1978, 114–118 | DOI
[11] D. Culler, et al., “LogP: Towards a Realistic Model of Parallel Computation”, Proceedings of the Fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming — PPOPP-93 (New York, New York, USA), ACM Press, 1993, 1–12 | DOI
[12] L. Yuan, et al., “LogGPH: A Parallel Computational Model with Hierarchical Communication Awareness”, Proceedings of the 2010 13th IEEE International Conference on Computational Science and Engineering, CSE-10, IEEE Computer Society, Washington, DC, US, 2010, 268–274 | DOI
[13] F. Lu, J. Song, Y. Pang, “HLognGP: A Parallel Computation Model for GPU clusters”, Concurrency and Computation: Practice and Experience, 27:17 (2015), 4880–4896 | DOI
[14] X. Qiao, S. Chen, L. T. Yang, “HPM: a Hierarchical Model for Parallel Computations”, International Journal of High Performance Computing and Networking, 1:1 (2004), 1–3 | DOI
[15] J. A. Rico-Gallego, J. C. Diaz-Martin, “t-Lop: Modeling Performance of Shared Memory MPI”, Parallel Computing. North-Holland, 46 (2015), 14–31 | DOI
[16] J. A. Rico-Gallego, A. L. Lastovetsky, J. C. Diaz-Martin, “Model-Based Estimation of the Communication Cost of Hybrid Data-Parallel Applications on Heterogeneous Clusters”, IEEE Transactions on Parallel and Distributed Systems, 28:11 (2017), 3215–3228 | DOI
[17] G. Bilardi, et al., “On the Effectiveness of D-BSP as a Bridging Model of Parallel Computation”, Proceedings of the International Conference on Computational Science — ICCS-01. Part II. Lecture Notes in Computer Science (Berlin, Heidelberg), 2001, 579–588 | DOI
[18] N. A. Ezhova, L. B. Sokolinsky, “Parallel Computational Model for multiprocessor Systems With Distributed Memory”, Series: Computational Mathematics and Software Engineering, 7:2 (2018), 32–49 | DOI
[19] N. A. Ezhova, L. B. Sokolinsky, “Scalability Evaluation of Iterative Algorithms for Supercomputer Simulation of Physical Processes”, Numerical methods and programming, 2018, no. 4, 416–430 | DOI
[20] L. H. Ceze, “Shared-Memory Multiprocessors”, Encyclopedia of Parallel Computing, 2011, 1810–1812, Springer US, Boston, MA | DOI
[21] B. A. Nayfeh, K. Olukotun, “A Single-chip Multiprocessor”, Computer, 30:9 (1997), 79–85 | DOI
[22] A. Bardine, et al., “NUMA Caches”, Encyclopedia of Parallel Computing, 2011, 1329–1338, Springer US, Boston, MA | DOI
[23] M. Snir, “Distributed-Memory Multiprocessor”, Encyclopedia of Parallel Computing, 2011, 574–578, Springer US, Boston, MA
[24] G. F. Pfister, In Search of Clusters, 2, Prentice Hall, Upper Saddle River, NJ, 1998, 575 pp.
[25] Beowulf Cluster Computing with Linux, ed. T. L. Sterling, MIT Press, Cambridge, London, 2002, 496 pp.
[26] J. D. Owens, et al., “GPU Computing”, Proceedings of the IEEE, 96:5 (2008), 879–899 | DOI
[27] C. Rochange, S. Uhrig, P. Sainrat, “Memory Hierarchy”, Time-Predictable Architectures, 2014, 69–104, John Wiley & Sons, Inc., Hoboken, NJ, USA | DOI
[28] J. L. Hennessy, D. A. Patterson, Computer Architecture: A Quantitative Approach, Computer, 5, Morgan Kaufmann, 2011, 856 pp.
[29] J. Bottomley, “Understanding Caching”, Linux Journal, 2004, no. 117, 58–62
[30] K. Wu, et al., “Early Evaluation of Intel Optane Non-Volatile Memory with HPC IO Workloads”, arXiv, 2017, 1–6
[31] C. T. Yang, C. L. Huang, C. F. Lin, “Hybrid CUDA, OpenMP, and MPI Parallel Programming on Multicore GPU Clusters”, Computer Physics Communications, 182:1 (2011), 266–269, North-Holland | DOI
[32] G. Bilardi, A. Pietracaprina, “Models of Computation, Theoretical”, Encyclopedia of Parallel Computing, 2011, 1150–1158, Springer US, Boston, MA | DOI
[33] D. B. Skillicorn, Parallelism and the Bird-Meertens Formalism, Kingston, Canada, 1992, 16 pp.
[34] G. Bilardi, A. Pietracaprina, G. Pucci, “A Quantitative Measure of Portability with Application to Bandwidth-Latency Models for Parallel Computing”, Euro-Par’99 Parallel Processing. Euro-Par 1999. Lecture Notes in Computer Science (Springer, Berlin, Heidelberg, 1999), 1999, 543–551 | DOI
[35] A. Grama, et al., “Architecture Independent Analysis of Parallel Programs”, Proceedings of the International Conference on Computational Science — ICCS-01. Part II. Lecture Notes in Computer Science (Berlin, Heidelberg), 2001, 599–608 | DOI
[36] J. F. JaJa, “PRAM (Parallel Random Access Machines)”, Encyclopedia of Parallel Computing, 2011, 1608–1615, Springer US, Boston, MA | DOI
[37] L. M. Goldschlager, “A Unified Approach to Models of Synchronous Parallel Machines”, Proceedings of the Tenth Annual ACM Symposium on Theory of Computing — STOC-78 (New York, New York, USA), ACM Press, 1978, 89–94 | DOI
[38] R. E. Ladner, M. J. Fischer, “Parallel Prefix Computation”, Journal of the ACM, 27:4 (1980), 831–838 | DOI
[39] J. F. JaJa, An Introduction to Parallel Algorithms, Addison Wesley Publishing Co., Reading, Redwood City, CA, USA, 1992, 576 pp.
[40] F. Darema, et al., “A Single-Program-Multiple-Data Computational Model for EPEX FORTRAN”, Parallel Computing, 7:1 (1988), 11–24 | DOI
[41] F. Darema, “SPMD Computational Model”, Encyclopedia of Parallel Computing, 2011, 1933–1943, Springer US, Boston, MA | DOI
[42] S. Cook, C. Dwork, R. Reischuk, “Upper and Lower Time Bounds for Parallel Random Access Machines without Simultaneous Writes”, SIAM Journal on Computing, 15:1 (1986), 87–97, Society for Industrial and Applied Mathematics | DOI
[43] R. M. Karp, V. Ramachandran, “Parallel Algorithms for Shared-Memory Machines”, Handbook of theoretical computer science. Volume A: Algorithms and Complexity, ed. J. Van Leeuwen, Elsevier, Amsterdam, 1990, 871–941
[44] N. Pippenger, “On Simultaneous Resource Bounds”, 20th Annual Symposium on Foundations of Computer Science (SFCS 1979) (San Juan, Puerto Rico), IEEE, 1979, 307–311 | DOI
[45] N. Pippenger, “Pebbling with an Auxiliary Pushdown”, Journal of Computer and System Sciences, 23:2 (1981), 151–165, Academic Press | DOI
[46] L. Snyder, “Type Architectures, Shared Memory, and the Corollary of Modest Potential”, Annual Review of Computer Science, 1:1 (1986), 289–317 | DOI
[47] K. Mehlhorn, U. Vishkin, “Randomized and Deterministic Simulations of PRAMs by Parallel Machines with Restricted Granularity of Parallel Memories”, Acta Informatica, 21:4 (1984), 339–374, Springer-Verlag | DOI
[48] P. B. Gibbons, Y. Matias, V. Ramachandran, “The Queue-Read Queue-Write PRAM Model: Accounting for Contention in Parallel Algorithms”, SIAM Journal on Computing, 28:2 (1998), 733–769 | DOI
[49] P. B. Gibbons, Y. Matias, “Can a Shared-Memory Model Serve as a Bridging Model for Parallel Computation?”, Theory of Computing Systems, 32:3 (1999), 327–359 | DOI
[50] A. Aggarwal, A. K. Chandra, M. Snir, “On Communication Latency in PRAM Computations”, Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures — SPAA’89 (New York), ACM Press, New York, USA, 1989, 11–21 | DOI
[51] Y. Mansour, N. Nisan, U. Vishkin, “Trade-offs between Communication Throughput and Parallel Time”, Journal of Complexity, 15:1 (1999), 148–166, Academic Press | DOI
[52] R. Cole, O. Zajicek, “The APRAM: Incorporating Asynchrony into the PRAM Model”, Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures–SPAA-89 (New York), ACM Press, New York, USA, 1989, 169–178 | DOI
[53] P. B. Gibbons, “A More Practical PRAM Model”, Proceedings of the First Annual ACM Symposium on Parallel Algorithms and Architectures — SPAA-89 (New York), ACM Press, New York, USA, 1989, 158–168 | DOI
[54] L. G. Valiant, “General Purpose Parallel Architectures”, Handbook of Theoretical Computer Science (Vol. A): Algorithms and Complexity, 1990, 943–971, Elsevier | DOI
[55] P. Torre, C. P. Kruskal, “Towards a Single Model of Efficient Computation in Real Parallel Machines”, Future Generation Computer Systems, 8:4 (1992), 395–408, North-Holland | DOI
[56] T. Heywood, S. Ranka, “A Practical Hierarchical Model of Parallel Computation I. The model”, Journal of Parallel and Distributed Computing, 16:3 (1992), 212–232, Academic Press | DOI
[57] M. Forsell, “A PRAM-NUMA Model of Computation for Addressing Low-TLP Workloads”, International Journal of Networking and Computing, 1:1 (2011), 21–35, Hiroshima University
[58] A. G. Ranade, “How to Emulate Shared Memory”, Journal of Computer and System Sciences, 42:3 (1991), 307–326, Academic Press | DOI
[59] M. Forsell, et al., “Hardware and Software Support for NUMA Computing on Configurable Emulated Shared Memory Architectures”, Workshops and PhD Forum, 2013 IEEE International Symposium on Parallel Distributed Processing, IEEE, 2013, 640–648 | DOI
[60] M. Forsell, “E–A Language for Thread-Level Parallel Programming on Synchronous Shared Memory NOCs”, WSEAS Transactions on Computers, 3:3 (2004), 807–812
[61] M. Forsell, V. Leppanen, “An Extended PRAM-NUMA Model of Computation for TCF Programming”, International Journal of Networking and Computing, 3:1 (2013), 98–115
[62] A. Aggarwal, et al., “A Model for Hierarchical Memory”, Proceedings of the Nineteenth annual ACM Conference on Theory of Computing, STOC’87 (New York), ACM Press, New York, USA, 1987, 305–314 | DOI
[63] A. Aggarwal, A. K. Chandra, M. Snir, “Hierarchical Memory with Block Transfer”, 28th Annual Symposium on Foundations of Computer Science, SFCS-1987, IEEE, 1987, 204–216 | DOI
[64] F. Luccio, L. Pagli, “A Model of Sequential Computation with Pipelined Access to Memory”, Mathematical Systems Theory, 26:4 (1993), 343–356, Springer-Verlag | DOI
[65] C. A. Mead, L. A. Conway, Introduction to VLSI systems, Addison-Wesley, Boston, MA, USA, 1980, 396 pp.
[66] B. Alpern, et al., “The Uniform Memory Hierarchy Model of Computation”, Algorithmica, 12:2 (1994), 72–109, Springer-Verlag | DOI
[67] J. S. Vitter, E. A. Shriver, “Algorithms for parallel memory, II: Hierarchical multilevel memories”, Algorithmica, 12:2 (1994), 148–169, Springer-Verlag | DOI
[68] A. Tiskin, “BSP (Bulk Synchronous Parallelism)”, Encyclopedia of Parallel Computing, 2011, 192–199, Springer US, Boston, MA | DOI
[69] M. Goudreau, et al., “Towards Efficiency and Portability: Programming with the BSP Model”, Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA’96 (New York), ACM Press, New York, NY, USA, 1996, 1–12 | DOI
[70] R. H. Bisseling, Parallel Scientific Computation: A Structured Approach using BSP and MPI, Oxford University Press, New York, 2004, 325 pp.
[71] W. F. McColl, “Scalable Computing”, Computer Science Today: Recent Trends and Developments. Lecture Notes in Computer Science, 1000 (1995), 46–61, Springer, Berlin, Heidelberg | DOI
[72] A. Tiskin, “The Bulk-synchronous Parallel Random Access Machine”, Theoretical Computer Science, 196:1 (1998), 109–130 | DOI
[73] W. F. McColl, A. Tiskin, “Memory-Efficient Matrix Multiplication in the BSP Model”, Algorithmica, 24:3 (1999), 3–4, Springer-Verlag | DOI
[74] T. Kielmann, S. Gorlatch, “Bandwidth-Latency Models (BSP, LogP)”, Encyclopedia of Parallel Computing, 2011, 107–112, Springer US, Boston, MA | DOI
[75] A. Alexandrov, et al., “LogGP: Incorporating Long Messages into the LogP Model for Parallel Computation”, Journal of Parallel and Distributed Computing, 44:1 (1997), 71–79 | DOI
[76] T. Kielmann, H. E. Bal, K. Verstoep, “Fast Measurement of LogP Parameters for Message Passing Platforms”, Lecture Notes in Computer Science, Parallel and Distributed Processing. IPDPS 2000, Springer, Berlin, Heidelberg, 2000, 1176–1183 | DOI
[77] W. Gropp, E. Lusk, A. Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface, 2, MIT Press, 1999
[78] W. Gropp, “MPI 3 and Beyond: Why MPI Is Successful and What Challenges It Faces”, Lecture Notes in Computer Science, Recent Advances in the Message Passing Interface. EuroMPI 2012, eds. J. L. Traff, S. Benkner, J. J. Dongarra, Springer, Berlin, Heidelberg, 2012, 1–9 | DOI
[79] T. Touyama, S. Horiguchi, “Parallel Computation Model LogPQ”, Lecture Notes in Computer Science, High Performance Computing. ISHPC 1997., eds. C. Polychronopoulos, K. Joe, A. M. K. Araki, Springer, Berlin, Heidelberg, 1997, 327–334 | DOI
[80] T. Touyama, S. Horiguchi, “Performance Evaluation of Practical Parallel Computation Model LogPQ”, Proceedings of the Fourth International Symposium on Parallel Architectures, Algorithms, and Networks, I-SPAN 99, IEEE Computer Society, Washington, DC, USA, 1999, 216–221 | DOI
[81] J. Palmer, G. L. Steele, “Connection Machine model CM-5 System Overview”, Frontiers’92, the Fourth Symposium on the Frontiers of Massive Parallel Computation (October 19–21, 1992, McLean, Virginia), IEEE Computer Society Press, 1992, 474–483 | DOI
[82] F. Ino, N. Fujimoto, K. Hagihara, “LogGPS: A Parallel Computational Model for Synchronization Analysis”, ACM SIGPLAN Notices, 36:7 (2001), 133–142 | DOI
[83] W. Gropp, et al., “A High-performance, Portable Implementation of the MPI Message Passing Interface Standard”, Parallel Computing, 22:6 (1996), 789–828 | DOI
[84] C. A. Moritz, et al., “LoGPC: Modeling Network Contention in Message-Passing Programs”, ACM SIGMETRICS Performance Evaluation Review, 26:1 (1998), 254–263, ACM Press, New York, USA | DOI
[85] C. A. Moritz, M. I. Frank, “LoGPC: Modeling Network Contention in Message-Passing Programs”, IEEE Transactions on Parallel and Distributed Systems, 12:4 (2001), 404–415 | DOI
[86] A. Agarwal, et al., “The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor”, Scalable Shared Memory Multiprocessors. Proceedings of a workshop held (May 26-27, 1990, in Seattle, Wash.), eds. M. Dubois, S. Thakkar, Springer, Boston, MA, 1992, 239–261 | DOI
[87] J. Kubiatowicz, A. Agarwal, “Anatomy of a Message in the Alewife multiprocessor”, ACM International Conference on Supercomputing 25th Anniversary Volume (New York), ACM Press, NY, USA, 2014, 193–204 | DOI
[88] K. W. Cameron, R. Ge, X. H. Sun, “lognP and log3P: Accurate Analytical Models of Point-to-point Communication in Distributed Systems”, IEEE Transactions on Computers, 56:3 (2007), 314–327 | DOI
[89] K. W. Cameron, R. Ge, “Predicting and Evaluating Distributed Communication Performance”, Proceedings of the 2004 ACM IEEE Conference on Supercomputing (IEEE, 2004), 2004, 15 pp. | DOI
[90] K. W. Cameron, X. H. Sun, “Quantifying Locality Effect in Data Access Delay: Memory logP”, Proceedings of the 2003 IEEE International Parallel and Distributed Processing Symposium (IPDPS’03), IEEE Comput. Soc, 2003, 8 pp. | DOI
[91] F. Cappello, et al., “HiHCoHP-Toward a Realistic Communication Model for Hierarchical Hyperclusters of Heterogeneous Processors”, Proceedings 15th International Parallel and Distributed Processing Symposium IEEE Comput. Soc. (IPDPS 2001), IEEE Comput. Soc., 2001, 6 pp. | DOI
[92] F. Cappello, et al., “An Algorithmic Model for Heterogeneous Hyper-Clusters: Rationale and Experience”, International Journal of Foundations of Computer Science., 16:2 (2005), 195–215, World Scientific Publishing Company | DOI
[93] J. L. Bosque, L. Pastor, “A Parallel Computational Model for Heterogeneous Clusters”, IEEE Transactions on Parallel and Distributed Systems, 17:12 (2006), 1390–1400 | DOI
[94] T. Hoefler, et al., “LogfP – a Model for Small Messages in InfiniBand”, Proceedings 20th IEEE International Parallel Distributed Processing Symposium (Washington, DC, USA), IEEE Computer Society, 2006, 319–319 | DOI
[95] T. C. Jepsen, “InfiniBand”, Distributed Storage Networks: Architecture, Protocols and Management, 2013, 159–174, John Wiley & Sons, Chichester, West Sussex, England | DOI
[96] W. Nasri, O. Tarhouni, N. Slimi, “PLP: Towards a Realistic and Accurate Model for Communication Performances on Hierarchical Cluster-based Systems”, 2008 IEEE International Symposium on Parallel and Distributed Processing (IEEE, 2008), 2008, 1–8 | DOI
[97] T. Hoefler, T. Schneider, A. Lumsdaine, “LogGOPSim – Simulating Large-Scale Applications in the LogGOPS Model”, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (HPDC’10, New York), ACM Press, New York, USA, 2010, 597–604 | DOI
[98] L. G. Valiant, “A Bridging Model for Multi-core Computing”, Journal of Computer and System Sciences, 77:1 (2011), 154–166, Elsevier Inc. | DOI
[99] B. Tu, et al., “Performance Analysis and Optimization of MPI Collective Operations on Multicore Clusters”, The Journal of Supercomputing, 60:1 (2012), 141–162, Springer US | DOI
[100] B. Tu, et al., “Accurate Analytical Models for Message Passing on Multi-core Clusters”, 2009 17th Euromicro International Conference on Parallel, Distributed and Network-based Processing, IEEE, 2009, 133–139 | DOI
[101] T. Sterling, et al., “SLOWER: A Performance Model for Exascale Computing”, Supercomputing Frontiers and Innovations, 1:2 (2014), 42–57 | DOI
[102] A. V. Gerbessiotis, “Extending the BSP Model for Multi-core and Out-of-core Computing: MBSP”, Parallel Computing, 41 (2015), 90–102, Elsevier B.V. | DOI
[103] M. Amaris, et al., “A Simple BSP-based Model to Predict Execution Time in GPU Applications”, 2015 IEEE 22nd International Conference on High Performance Computing (HiPC), IEEE, 2015, 285–294 | DOI
[104] B. M. Maggs, L. R. Matheson, R. E. Tarjan, “Models of Parallel Computation: a Survey and Synthesis”, Proceedings of the Twenty-Eighth Hawaii International Conference on System Sciences, IEEE Comput. Soc. Press, 1995, 61–70 | DOI
[105] J. A. Rico-Gallego, et al., “A Survey of Communication Performance Models for High-Performance Computing”, ACM Computing Surveys, 51:6 (2019), 1–36, ACM | DOI