Solution of graph problems by means of the STAR-machine being implemented on GPUs
Prikladnaâ diskretnaâ matematika, no. 3 (2016), pp. 98-115.

Voir la notice de l'article provenant de la source Math-Net.Ru

Efficient algorithms have been developed for the solution of many graph problems. The solution is performed with associative parallel processors. At present there are no widely used associative architectures. Nevertheless, the recent development of the GPUs has made it possible to implement the associative parallel models with no significant efficiency loss. It enables to use the associative algorithms in practice. The implementation of the abstract model of the associative parallel data processor is given (the STAR-machine) with the GPUs by means of the CUDA technology. The performance is being analysed as well as the efficiency of the solution of the graph problems. The test case is the Warshall algorithm for finding the transitive closure of a directed graph. A graph with 5000 nodes was processed by the sequential Warshall algorithm for 884,622 s, by the associative parallel version for 64,454 s (speed-up is 13 times) and by the GPU-adopted associative parallel version for 0,372 s (speed-up is 2 378 times).
Keywords: associative parallel processor, vertical data processing, SIMD, GPU, directed graph, adjacency matrix, transitive closure.
@article{PDM_2016_3_a8,
     author = {T. V. Snytnikova and A. Sh. Nepomniaschaya},
     title = {Solution of graph problems by means of the {STAR-machine} being implemented on {GPUs}},
     journal = {Prikladna\^a diskretna\^a matematika},
     pages = {98--115},
     publisher = {mathdoc},
     number = {3},
     year = {2016},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/PDM_2016_3_a8/}
}
TY  - JOUR
AU  - T. V. Snytnikova
AU  - A. Sh. Nepomniaschaya
TI  - Solution of graph problems by means of the STAR-machine being implemented on GPUs
JO  - Prikladnaâ diskretnaâ matematika
PY  - 2016
SP  - 98
EP  - 115
IS  - 3
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/PDM_2016_3_a8/
LA  - ru
ID  - PDM_2016_3_a8
ER  - 
%0 Journal Article
%A T. V. Snytnikova
%A A. Sh. Nepomniaschaya
%T Solution of graph problems by means of the STAR-machine being implemented on GPUs
%J Prikladnaâ diskretnaâ matematika
%D 2016
%P 98-115
%N 3
%I mathdoc
%U http://geodesic.mathdoc.fr/item/PDM_2016_3_a8/
%G ru
%F PDM_2016_3_a8
T. V. Snytnikova; A. Sh. Nepomniaschaya. Solution of graph problems by means of the STAR-machine being implemented on GPUs. Prikladnaâ diskretnaâ matematika, no. 3 (2016), pp. 98-115. http://geodesic.mathdoc.fr/item/PDM_2016_3_a8/

[1] Potter J. L., Associative Computing: a Programming Paradigm for Massively Parallel Computers, Perseus Publishing, Boston, 1991, 304 pp.

[2] Rudolph J. A., “A production implementation of an associative array processor – STARAN”, AFIPS'72, ACM, N.Y., 1972, 229–241

[3] Foster C. C., Content Addressable Parallel Processors, John Wiley Sons, N.Y., 1976, 233 pp.

[4] Higuchi T., Furuya T., Handa K., et al., “IXM2: a parallel associative processor”, ISCA'91, Proc. 18th Annual Intern. Symp. Computer Architecture, ACM, N.Y., 1991, 22–31

[5] Nepomnyashchaya A. Sh., Vladyko M. A., “Compare of associative computation models”, Programmirovanie, 1997, no. 6, 41–50 (in Russian)

[6] Nepomniaschaya A. Sh., “Language STAR for associative and parallel computation with vertical data processing”, Parallel Computing Technologies, World Scientific, Singapore, 1991, 258–265

[7] Potter J., Baker J., Scott S., et al., “ASC: An Associative-Computing Paradigm”, Computer, 27:11 (1994), 19–25 | DOI

[8] Nepomniaschaya A. Sh., “Basic associative parallel algorithms for vertical processing systems”, Bulletin of the Novosibirsk Computing Center. Ser. Comp. Science, 2009, no. 9, 63–77

[9] Nepomniaschaya A. Sh., “Constructions used in associative parallel algorithms for undirected graphs. Part 1”, Bulletin of the Novosibirsk Computing Center. Ser. Comp. Science, 2013, no. 35, 69–83

[10] Nepomniaschaya A. Sh., “Constructions used in associative parallel algorithms for directed graphs”, LNCS, 9251, 2015, 201–209

[11] Nepomniaschaya A. Sh., “Solution of path problems using associative parallel processors”, Intern. Conf. Parallel and Distributed Systems, IEEE Computer Society Press, Seoul, 1997, 610–617

[12] Nepomniaschaya A. Sh., Dvoskina M. A., “A simple implementation of Dijkstra's shortest path algorithm on associative parallel processors”, Fundamenta Informaticae (IOS Press), 43 (2000), 227–243 | Zbl

[13] Nepomniaschaya A. Sh., “An associative version of the Bellman–Ford algorithm for finding the shortest paths in directed graphs”, LNCS, 2127, 2001, 285–292 | Zbl

[14] Nepomniaschaya A. Sh., “Efficient implementation of Edmonds' algorithm for finding optimum branchings on associative parallel processors”, Proc. 8th Intern. Conf. Parallel and Distributed Systems (ICPADS'01), IEEE Computer Society Press, KyongJu City, 2001, 3–8

[15] Nepomnyashchaya A. Sh., “Compare of Prima–Dijkstra and Kruskal algorithms using associative parallel processor”, Kibernetika i sistemnyy analiz, 2000, no. 2, 19–27 (in Russian)

[16] Nepomniaschaya A. Sh., “Representation of the Gabow algorithm for finding smallest spanning trees with a degree constraint on associative parallel processors”, LNCS, 1123, 1996, 813–817

[17] Nepomniaschaya A. Sh., “Associative parallel algorithms for dynamic edge update of minimum spanning trees”, LNCS, 2763, 2003, 141–150

[18] Nepomniaschaya A. Sh., “Associative parallel algorithm for dynamic reconstructing a minimum spanning tree after deletion of a vertex”, LNCS, 3606, 2005, 151–173

[19] Nepomnyashchaya A. Sh., “An associative parallel algorithm for dynamic update of a minimum spanning tree after adding a new vertex to a graph”, Kibernetika i sistemnyy analiz, 2006, no. 1, 19–31 (in Russian)

[20] Nepomniaschaya A. Sh., “Efficient implementation of the Italiano algorithms for updating the transitive closure on associative parallel processors”, Fundamenta Informaticae (IOS Press), 89:2–3 (2008), 313–329 | MR | Zbl

[21] Nepomniaschaya A. Sh., “Associative version of the Ramalingam decremental algorithm for dynamic updating the single-sink shortest paths subgraph”, LNCS, 5698, 2009, 257–268

[22] Nepomnyashchaya A. Sh., “Associative version of the ramalingam algorithm for dynamic update of the shortest-path subgraph after insertion of a new edge to the graph”, Kibernetika i sistemnyy analiz, 2012, no. 3, 45–57 (in Russian)

[23] Mingxian J., “Associative operations from MASC to GPU”, PDPTA-15, The 21st Intern. Conf. on Parallel and Distributed Processing Techniques and Applications, CSREA Press, Las Vegas, 2015, 388–393

[24] 9th DIMACS Implementation Challenge – Shortest Paths, , 2010 http://www.dis.uniroma1.it/challenge9/download.s

[25] Vysokoproizvoditelnye vychisleniya dlya serverov| Graficheskie protsessory Tesla|NVIDIA, , 2016 http://www.nvidia.ru/object/tesla-server-gpus-ru.html

[26] Harish P., Narayanan P., “Accelerating large graph algorithms on the GPU using CUDA”, LNCS, 4873, 2007, 197–208

[27] Katz G. J., Kider Jr. T., “All-pairs shortest-paths for large graphs on the GPU”, Proc. 23rd ACM SIGGRAPH/EUROGRAPHICS Symp. Graphics Hardware, Eurographics Association, Sarajevo, 2008, 47–55

[28] Lund B., Smith J. W., A Multi-Stage CUDA Kernel for Floyd–Warshall, arXiv: 1001.4108