Methods and tools for organizing the global job queue in the geographically distributed computing system
Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika, Volume 6 (2017) no. 4, pp. 28-42. This article was harvested from the source Math-Net.Ru.


The geographically distributed computing infrastructure (DCI) considered in the paper consists of high-performance computing systems connected by communication channels. The computing systems in the DCI are high-performance clusters that differ in architecture and performance. The communication channels connecting the clusters differ in reliability and bandwidth. The DCI model under consideration uses a decentralized job management and dispatching scheme. Under this scheme, a malfunction of any computing cluster or a failure of a communication channel can cause the cluster to leave the DCI at any time. Once the cluster or channel is repaired, the cluster dynamically reconnects to the DCI. A global job queue is organized in this computing infrastructure. Computing jobs have absolute priorities, so a high-priority job can interrupt running low-priority jobs. Jobs from the global queue are allocated to idle resources of the computing systems. Building and storing the global job queue while the composition of the DCI changes dynamically requires a reliable information system. The authors reviewed several distributed DBMSs as the basis for this information system. The article outlines the requirements for a distributed information system. The authors conducted a comparative analysis, selected a solution that satisfies the requirements, and designed a prototype of the geographically distributed computing infrastructure with the decentralized job dispatching scheme.
Keywords: grid, information system, absolute priorities.
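
The abstract states that jobs have absolute priorities and that a high-priority job can interrupt running low-priority jobs, but it does not give the scheduling algorithm itself. The following is only a minimal sketch of that preemption rule for a single cluster, under stated assumptions: the class `Cluster`, its methods `submit`, `finish` and `dispatch`, and the one-slot-per-job model are all hypothetical and are not taken from the paper. It does not model the distributed queue or cluster failures.

```python
import heapq
from itertools import count


class Cluster:
    """Hypothetical single cluster with a fixed number of job slots.

    Assumption: every job occupies exactly one slot, and a larger
    number means a higher (absolute) priority.
    """

    def __init__(self, slots):
        self.slots = slots
        self.running = []    # list of (priority, name) for running jobs
        self.queue = []      # min-heap of (-priority, seq, name)
        self._seq = count()  # tie-breaker: earlier submissions go first

    def submit(self, name, priority):
        """Put a job into the queue and try to start it."""
        heapq.heappush(self.queue, (-priority, next(self._seq), name))
        self.dispatch()

    def finish(self, name):
        """Mark a running job as completed and reuse the freed slot."""
        self.running = [(p, n) for p, n in self.running if n != name]
        self.dispatch()

    def dispatch(self):
        """Start queued jobs on idle slots, preempting if necessary."""
        while self.queue:
            neg_priority, _, name = self.queue[0]
            priority = -neg_priority
            if len(self.running) < self.slots:
                # An idle slot exists: start the best queued job.
                heapq.heappop(self.queue)
                self.running.append((priority, name))
                continue
            victim = min(self.running)  # lowest-priority running job
            if victim[0] >= priority:
                break  # nothing running may be interrupted
            # Absolute priority: interrupt the victim and requeue it.
            heapq.heappop(self.queue)
            self.running.remove(victim)
            self.running.append((priority, name))
            heapq.heappush(self.queue, (-victim[0], next(self._seq), victim[1]))


cluster = Cluster(slots=2)
cluster.submit("a", priority=1)
cluster.submit("b", priority=2)
cluster.submit("c", priority=5)  # cluster is full, so "a" is interrupted
print(sorted(name for _, name in cluster.running))
print([name for _, _, name in cluster.queue])
```

In this sketch an interrupted job simply returns to the queue and restarts when a slot becomes free; whether the system described in the paper checkpoints, migrates, or restarts interrupted jobs is not stated in the abstract.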
@article{VYURV_2017_6_4_a2,
     author = {A. V. Baranov and A. I. Tikhomirov},
     title = {Methods and tools for organizing the global job queue in the geographically distributed computing system},
     journal = {Vestnik \^U\v{z}no-Uralʹskogo gosudarstvennogo universiteta. Seri\^a Vy\v{c}islitelʹna\^a matematika i informatika},
     pages = {28--42},
     year = {2017},
     volume = {6},
     number = {4},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VYURV_2017_6_4_a2/}
}
TY  - JOUR
AU  - A. V. Baranov
AU  - A. I. Tikhomirov
TI  - Methods and tools for organizing the global job queue in the geographically distributed computing system
JO  - Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika
PY  - 2017
SP  - 28
EP  - 42
VL  - 6
IS  - 4
UR  - http://geodesic.mathdoc.fr/item/VYURV_2017_6_4_a2/
LA  - ru
ID  - VYURV_2017_6_4_a2
ER  - 
%0 Journal Article
%A A. V. Baranov
%A A. I. Tikhomirov
%T Methods and tools for organizing the global job queue in the geographically distributed computing system
%J Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika
%D 2017
%P 28-42
%V 6
%N 4
%U http://geodesic.mathdoc.fr/item/VYURV_2017_6_4_a2/
%G ru
%F VYURV_2017_6_4_a2
A. V. Baranov; A. I. Tikhomirov. Methods and tools for organizing the global job queue in the geographically distributed computing system. Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ Vyčislitelʹnaâ matematika i informatika, Volume 6 (2017) no. 4, pp. 28-42. http://geodesic.mathdoc.fr/item/VYURV_2017_6_4_a2/
