Thread «Efficiency» on the Shared Memory Multiprocessors
Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, no. 14 (2012), pp. 141-155. This article was harvested from the source Math-Net.Ru.

It is traditionally assumed that a computation decomposed in a certain way into several threads runs more efficiently on shared-memory systems (SMP or NUMA) than the same computation decomposed into several processes. In this work we hypothesize that this assumption may be false for computations over large data volumes, mainly for two reasons. First, supporting a common shared address space for the threads may introduce substantially more overhead than the aggregate cost of switching execution contexts between processes. Second, even when the computation does not require intensive memory management, the natural limit on how much of the memory working set the TLB can describe means that this translation cache must be refreshed frequently when threads are used as well. Experiments and results that confirm our hypothesis are described later in the article.
Keywords: shared memory, performance, threads, processes.
@article{VYURU_2012_14_a13,
     author = {M. O. Bakhterev},
     title = {Thread {<<Efficiency>>} on the {Shared} {Memory} {Multiprocessors}},
     journal = {Vestnik \^U\v{z}no-Uralʹskogo gosudarstvennogo universiteta. Seri\^a, Matemati\v{c}eskoe modelirovanie i programmirovanie},
     pages = {141--155},
     year = {2012},
     number = {14},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VYURU_2012_14_a13/}
}
TY  - JOUR
AU  - M. O. Bakhterev
TI  - Thread «Efficiency» on the Shared Memory Multiprocessors
JO  - Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
PY  - 2012
SP  - 141
EP  - 155
IS  - 14
UR  - http://geodesic.mathdoc.fr/item/VYURU_2012_14_a13/
LA  - ru
ID  - VYURU_2012_14_a13
ER  - 
%0 Journal Article
%A M. O. Bakhterev
%T Thread «Efficiency» on the Shared Memory Multiprocessors
%J Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
%D 2012
%P 141-155
%N 14
%U http://geodesic.mathdoc.fr/item/VYURU_2012_14_a13/
%G ru
%F VYURU_2012_14_a13
M. O. Bakhterev. Thread «Efficiency» on the Shared Memory Multiprocessors. Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, no. 14 (2012), pp. 141-155. http://geodesic.mathdoc.fr/item/VYURU_2012_14_a13/

[1] AMD64 Architecture Programmer's Manual, v. 2, System Programming, Advanced Micro Devices, 2011

[2] R. Appleton, “Understanding a Context Switching Benchmark”, Linux J., 1999, no. 57, 1–6

[3] How long does it take to make a context switch?, 2010 (accessed 16.10.2012) http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html

[4] M. L. Powell, S. R. Kleiman, S. Barton, D. Shah, D. Stein, M. Weeks, “SunOS Multi-thread Architecture”, Proceedings of the Winter USENIX Conference (Dallas, TX, 1991), 65–80

[5] The SPARC Architecture Manual. Version 8, SPARC International Inc., Menlo Park, CA, 1991

[6] V. Uhlig, U. Dannowski, E. Skoglund, A. Haeberlen, G. Heiser, Performance of Address-Space Multiplexing on the Pentium, 2002

[7] B. Jacob, T. Mudge, “Virtual Memory in Contemporary Microprocessors”, IEEE Micro, 18:4 (1998), 60–75

[8] A. Tanenbaum, Modern Operating Systems, 3rd Edition, Pearson, New Jersey, 2009

[9] G. Hunt, J. R. Larus, M. Abadi, M. Aiken, P. Barham, M. Fahndrich, C. Hawblitzel, O. Hodson, S. Levi, N. Murphy, B. Steensgaard, D. Tarditi, T. Wobber, B. D. Zill, An Overview of the Singularity Project, Microsoft Research, 2005

[10] Linux 3.3.4 source code, switch_mm function (accessed 16.10.2012) http://lxr.linux.no/#linux+v3.3.4/arch/x86/include/asm/mmu_context.h#L33

[11] M. O. Bakhterev, Thread proc benchmark, 2012 (accessed 16.10.2012) https://github.com/coda/thread-proc

[12] P. Snyder, “tmpfs: A Virtual Memory File System”, Proceedings of the European USENIX Conference (France, Nice, 1990), 241–248

[13] J. Shin, K. Tam, D. Huang, B. Petrick, H. Pham, C. Hwang, H. Li, A. Smith, T. Johnson, F. Schumacher, D. Greenhill, A. Leon, A. Strong, “A 40nm 16-Core 128-Thread CMT SPARC SoC Processor”, ISSCC Digest, 2010, no. 56, 98–99

[14] A. Baumann, S. Peter, A. Schüpbach, A. Singhania, T. Roscoe, P. Barham, R. Isaacs, Your computer is already a distributed system. Why isn't your OS?, Proceedings of the 12th HotOS Workshop (Monte Verità, 2009), 19

[15] M. O. Bakhterev, P. A. Vasev, A. Y. Kazantsev, I. A. Albrekht, “RiDE: The Distributed Computation Technique”, Proceedings of PaCT'2011 International Conference (Moscow, 2011), 418–426