HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm
Discrete mathematics & theoretical computer science, DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07), DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07) (2007).

Voir la notice de l'article provenant de la source Episciences

This extended abstract describes and analyses a near-optimal probabilistic algorithm, HYPERLOGLOG, dedicated to estimating the number of \emphdistinct elements (the cardinality) of very large data ensembles. Using an auxiliary memory of m units (typically, "short bytes''), HYPERLOGLOG performs a single pass over the data and produces an estimate of the cardinality such that the relative accuracy (the standard error) is typically about $1.04/\sqrt{m}$. This improves on the best previously known cardinality estimator, LOGLOG, whose accuracy can be matched by consuming only 64% of the original memory. For instance, the new algorithm makes it possible to estimate cardinalities well beyond $10^9$ with a typical accuracy of 2% while using a memory of only 1.5 kilobytes. The algorithm parallelizes optimally and adapts to the sliding window model.
@article{DMTCS_2007_special_253_a27,
     author = {Flajolet, Philippe and Fusy, \'Eric and Gandouet, Olivier and Meunier, Fr\'ed\'eric},
     title = {HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm},
     journal = {Discrete mathematics & theoretical computer science},
     publisher = {mathdoc},
     volume = {DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07)},
     year = {2007},
     doi = {10.46298/dmtcs.3545},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3545/}
}
TY  - JOUR
AU  - Flajolet, Philippe
AU  - Fusy, Éric
AU  - Gandouet, Olivier
AU  - Meunier, Frédéric
TI  - HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm
JO  - Discrete mathematics & theoretical computer science
PY  - 2007
VL  - DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07)
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3545/
DO  - 10.46298/dmtcs.3545
LA  - en
ID  - DMTCS_2007_special_253_a27
ER  - 
%0 Journal Article
%A Flajolet, Philippe
%A Fusy, Éric
%A Gandouet, Olivier
%A Meunier, Frédéric
%T HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm
%J Discrete mathematics & theoretical computer science
%D 2007
%V DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07)
%I mathdoc
%U http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3545/
%R 10.46298/dmtcs.3545
%G en
%F DMTCS_2007_special_253_a27
Flajolet, Philippe; Fusy, Éric; Gandouet, Olivier; Meunier, Frédéric. HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm. Discrete mathematics & theoretical computer science, DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07), DMTCS Proceedings vol. AH, 2007 Conference on Analysis of Algorithms (AofA 07) (2007). doi : 10.46298/dmtcs.3545. http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3545/

Cité par Sources :