De-duplication on the backup system with information storage in a database
Modelirovanie i analiz informacionnyh sistem, Volume 24 (2017) no. 2, pp. 215-226

See the article record from the source Math-Net.Ru

Preventing data loss from digital media involves backup. A backup can be made manually, by copying data to external media, or automatically on a schedule using special software. There are also remote backup systems, in which data are saved over the network to a remote repository. Such systems are multi-user and process large amounts of data. Files in shared storage may contain identical fragments. Repeated data are eliminated by the de-duplication mechanism, a method of information compression in which copies are searched for across the entire dataset rather than within a single file. The main advantage of this technology is a significant saving of disk space. However, eliminating repeated data can significantly reduce the speed of saving and restoring information. This article addresses the problem of implementing such a mechanism in a backup system that stores information in a relational database. We consider an example implementation of such a system that works in two modes: with de-duplication of data and without it. The article presents a class diagram for the client part of the application and describes the database tables on the backend and the relationships between them. The author proposes an algorithm for saving data with de-duplication and reports the results of comparative tests of the speed of the saving and restoring algorithms with relational database management systems from different vendors.
Keywords: file, de-duplication, data, backup, database.
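
To make the idea in the abstract concrete, the following is a minimal sketch of de-duplicated storage in a relational database. It is not the author's algorithm or schema, which the abstract does not specify. Everything here is an assumption made for illustration: the table names `chunks` and `file_chunks`, the fixed-size chunking, the SHA-256 fingerprint, the use of SQLite (including its `INSERT OR IGNORE` syntax), and the helper names `open_store`, `save_file`, `restore_file`, and `stored_chunk_count`.

```python
import hashlib
import sqlite3

# Hypothetical chunk size, kept tiny so the effect is visible on short inputs.
CHUNK_SIZE = 4


def open_store():
    """Create an in-memory SQLite database with an assumed two-table schema."""
    conn = sqlite3.connect(":memory:")
    # Each distinct chunk is stored once, keyed by its SHA-256 digest.
    conn.execute(
        "CREATE TABLE chunks (hash TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    # A file is an ordered list of references to chunks.
    conn.execute(
        "CREATE TABLE file_chunks ("
        " file_name TEXT NOT NULL,"
        " seq INTEGER NOT NULL,"
        " hash TEXT NOT NULL REFERENCES chunks(hash),"
        " PRIMARY KEY (file_name, seq))"
    )
    return conn


def save_file(conn, file_name, data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and store only unseen chunks."""
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        # SQLite-specific: skip the insert if this digest is already stored.
        conn.execute(
            "INSERT OR IGNORE INTO chunks (hash, data) VALUES (?, ?)",
            (digest, chunk),
        )
        conn.execute(
            "INSERT INTO file_chunks (file_name, seq, hash) VALUES (?, ?, ?)",
            (file_name, seq, digest),
        )
    conn.commit()


def restore_file(conn, file_name):
    """Reassemble a file by joining its chunk references in order."""
    rows = conn.execute(
        "SELECT c.data FROM file_chunks f JOIN chunks c ON c.hash = f.hash "
        "WHERE f.file_name = ? ORDER BY f.seq",
        (file_name,),
    )
    return b"".join(row[0] for row in rows)


def stored_chunk_count(conn):
    """Number of distinct chunks physically stored."""
    return conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]


if __name__ == "__main__":
    conn = open_store()
    save_file(conn, "a.txt", b"AAAABBBBCCCC")
    save_file(conn, "b.txt", b"AAAABBBBDDDD")
    # The two files reference six chunks in total, but only four are stored.
    print(stored_chunk_count(conn))
    print(restore_file(conn, "b.txt"))
```

The sketch also hints at the trade-off the abstract mentions: saving requires a hash computation and a lookup per chunk, and restoring requires a join, so disk space is saved at the cost of extra work on both paths. Saving the same file name twice is not handled here.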
@article{MAIS_2017_24_2_a6,
     author = {S. M. Taranin},
     title = {De-duplication on the backup system with information storage in a database},
     journal = {Modelirovanie i analiz informacionnyh sistem},
     pages = {215--226},
     publisher = {mathdoc},
     volume = {24},
     number = {2},
     year = {2017},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MAIS_2017_24_2_a6/}
}
S. M. Taranin. De-duplication on the backup system with information storage in a database. Modelirovanie i analiz informacionnyh sistem, Volume 24 (2017) no. 2, pp. 215-226. http://geodesic.mathdoc.fr/item/MAIS_2017_24_2_a6/