OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics
Computer Science and Information Systems, Tome 15 (2018) no. 2.

Voir la notice de l'article provenant de la source Computer Science and Information Systems website

Modern database systems can achieve high throughput main-memory query execution by being aware of the dynamics of highly parallel hardware. In such systems, data is partitioned into smaller pieces to reach a better parallelism. Unfortunately, data skew is one of the main problems faced during parallel processing in a parallel main memory database. In some data-intensive applications, parallel range queries over a dynamic range partitioned system are important. Continuous insertions/deletions can lead to a very high degree of data skew and consequently a poor performance of parallel range queries. In this paper, we propose an approach for maintaining balanced loads over a set of nodes as in a system of communicating vessels, by migrating tuples between neighboring nodes. These frequent (or even continuous) data transfers inevitably involve dynamic changes in the partition statistics. To avoid the performance degradation typically associated with this dynamism, we provide a solution based on an approximate Partition Statistics Table. The basic idea behind this table is that both clients and nodes may have an imperfect knowledge about the effective load distribution. They can nevertheless locate any data with almost the same efficiency as using exact partition statistics. Furthermore, maintaining load distribution statistics do not require exchanging additional messages as opposed to the cost of efficient solutions from the state-of-art (which requires at least O(logn) messages). We show through intensive experiments that our proposal supports efficient range queries, while simultaneously guaranteeing storage balance even in the presence of numerous concurrent insertions/deletions generating a heavy skewed data distribution.
Keywords: Load balancing, Parallel main memory database systems (MMBD), Range partitioning, Data skew, Range query
@article{CSIS_2018_15_2_a7,
     author = {Djahida Belayadi and Khaled-Walid Hidouci and Ladjel Bellatreche},
     title = {OLAPS: {Online} {Load-Balancing} in {Range-Partitioned} {Main} {Memory} {Database} with {Approximate} {Partition} {Statistics}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {15},
     number = {2},
     year = {2018},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2018_15_2_a7/}
}
TY  - JOUR
AU  - Djahida Belayadi
AU  - Khaled-Walid Hidouci
AU  - Ladjel Bellatreche
TI  - OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics
JO  - Computer Science and Information Systems
PY  - 2018
VL  - 15
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2018_15_2_a7/
ID  - CSIS_2018_15_2_a7
ER  - 
%0 Journal Article
%A Djahida Belayadi
%A Khaled-Walid Hidouci
%A Ladjel Bellatreche
%T OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics
%J Computer Science and Information Systems
%D 2018
%V 15
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2018_15_2_a7/
%F CSIS_2018_15_2_a7
Djahida Belayadi; Khaled-Walid Hidouci; Ladjel Bellatreche. OLAPS: Online Load-Balancing in Range-Partitioned Main Memory Database with Approximate Partition Statistics. Computer Science and Information Systems, Tome 15 (2018) no. 2. http://geodesic.mathdoc.fr/item/CSIS_2018_15_2_a7/