Data Streams as Random Permutations: the Distinct Element Problem
Discrete mathematics & theoretical computer science, DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12), DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12) (2012).

Voir la notice de l'article provenant de la source Episciences

In this paper, we show that data streams can sometimes usefully be studied as random permutations. This simple observation allows a wealth of classical and recent results from combinatorics to be recycled, with minimal effort, as estimators for various statistics over data streams. We illustrate this by introducing RECORDINALITY, an algorithm which estimates the number of distinct elements in a stream by counting the number of $k$-records occurring in it. The algorithm has a score of interesting properties, such as providing a random sample of the set underlying the stream. To the best of our knowledge, a modified version of RECORDINALITY is the first cardinality estimation algorithm which, in the random-order model, uses neither sampling nor hashing.
@article{DMTCS_2012_special_262_a23,
     author = {Helmi, Ahmed and Lumbroso, J\'er\'emie and Mart{\'\i}nez, Conrado and Viola, Alfredo},
     title = {Data {Streams} as {Random} {Permutations:} the {Distinct} {Element} {Problem}},
     journal = {Discrete mathematics & theoretical computer science},
     publisher = {mathdoc},
     volume = {DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12)},
     year = {2012},
     doi = {10.46298/dmtcs.3002},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3002/}
}
TY  - JOUR
AU  - Helmi, Ahmed
AU  - Lumbroso, Jérémie
AU  - Martínez, Conrado
AU  - Viola, Alfredo
TI  - Data Streams as Random Permutations: the Distinct Element Problem
JO  - Discrete mathematics & theoretical computer science
PY  - 2012
VL  - DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12)
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3002/
DO  - 10.46298/dmtcs.3002
LA  - en
ID  - DMTCS_2012_special_262_a23
ER  - 
%0 Journal Article
%A Helmi, Ahmed
%A Lumbroso, Jérémie
%A Martínez, Conrado
%A Viola, Alfredo
%T Data Streams as Random Permutations: the Distinct Element Problem
%J Discrete mathematics & theoretical computer science
%D 2012
%V DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12)
%I mathdoc
%U http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3002/
%R 10.46298/dmtcs.3002
%G en
%F DMTCS_2012_special_262_a23
Helmi, Ahmed; Lumbroso, Jérémie; Martínez, Conrado; Viola, Alfredo. Data Streams as Random Permutations: the Distinct Element Problem. Discrete mathematics & theoretical computer science, DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12), DMTCS Proceedings vol. AQ, 23rd Intern. Meeting on Probabilistic, Combinatorial, and Asymptotic Methods for the Analysis of Algorithms (AofA'12) (2012). doi : 10.46298/dmtcs.3002. http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3002/

Cité par Sources :