Efficient estimation of the cardinality of large data sets
Discrete mathematics & theoretical computer science, DMTCS Proceedings vol. AG, Fourth Colloquium on Mathematics and Computer Science Algorithms, Trees, Combinatorics and Probabilities (2006).

See the article record from the source Episciences

Giroire has recently proposed an algorithm which returns the $\textit{approximate}$ number of distinct elements in a large sequence of words, under strong constraints coming from the analysis of large databases. His estimation is based on statistical properties of uniform random variables in $[0,1]$. In this note we propose an optimal estimation, using Kullback information and estimation theory.
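
The idea of estimating a cardinality from uniform random variables in $[0,1]$ can be illustrated with a minimal sketch. The code below is not Giroire's algorithm and not the optimal estimator of this note; it is a generic "k smallest values" estimator, shown only as an assumption-laden example. The function names, the SHA-256 hash, and the parameter `k` are all hypothetical choices. Each word is hashed to a pseudo-uniform value, the `k` smallest distinct values are kept, and the count is estimated as `(k - 1)` divided by the k-th smallest value.

```python
import hashlib
import heapq


def hash_to_unit_interval(word: str) -> float:
    """Map a word to a pseudo-uniform value in [0, 1] (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_cardinality(words, k: int = 256) -> float:
    """Estimate the number of distinct words from the k smallest hash values."""
    if k < 2:
        raise ValueError("k must be at least 2")
    heap = []     # negated values: heap[0] is minus the largest value kept
    kept = set()  # hash values currently in the heap, to skip repeated words
    for word in words:
        u = hash_to_unit_interval(word)
        if u in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -u)
            kept.add(u)
        elif u < -heap[0]:
            # Replace the largest kept value by the new, smaller one.
            evicted = -heapq.heappushpop(heap, -u)
            kept.discard(evicted)
            kept.add(u)
    if len(heap) < k:
        # Fewer than k distinct hash values were seen: report the count directly.
        return float(len(heap))
    # k-th smallest of n uniforms is about k / n, so invert it.
    return (k - 1) / (-heap[0])
```

Memory use depends only on `k`, not on the length of the sequence, and the relative error of this kind of estimator shrinks roughly like $1/\sqrt{k}$.
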
@article{DMTCS_2006_special_252_a16,
     author = {Chassaing, Philippe and Gerin, Lucas},
     title = {Efficient estimation of the cardinality of large data sets},
     journal = {Discrete mathematics \& theoretical computer science},
     publisher = {mathdoc},
     volume = {DMTCS Proceedings vol. AG, Fourth Colloquium on Mathematics and Computer Science Algorithms, Trees, Combinatorics and Probabilities},
     year = {2006},
     doi = {10.46298/dmtcs.3492},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3492/}
}
TY  - JOUR
AU  - Chassaing, Philippe
AU  - Gerin, Lucas
TI  - Efficient estimation of the cardinality of large data sets
JO  - Discrete mathematics & theoretical computer science
PY  - 2006
VL  - DMTCS Proceedings vol. AG, Fourth Colloquium on Mathematics and Computer Science Algorithms, Trees, Combinatorics and Probabilities
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3492/
DO  - 10.46298/dmtcs.3492
LA  - en
ID  - DMTCS_2006_special_252_a16
ER  - 
%0 Journal Article
%A Chassaing, Philippe
%A Gerin, Lucas
%T Efficient estimation of the cardinality of large data sets
%J Discrete mathematics & theoretical computer science
%D 2006
%V DMTCS Proceedings vol. AG, Fourth Colloquium on Mathematics and Computer Science Algorithms, Trees, Combinatorics and Probabilities
%I mathdoc
%U http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3492/
%R 10.46298/dmtcs.3492
%G en
%F DMTCS_2006_special_252_a16
Chassaing, Philippe; Gerin, Lucas. Efficient estimation of the cardinality of large data sets. Discrete mathematics & theoretical computer science, DMTCS Proceedings vol. AG, Fourth Colloquium on Mathematics and Computer Science Algorithms, Trees, Combinatorics and Probabilities (2006). doi: 10.46298/dmtcs.3492. http://geodesic.mathdoc.fr/articles/10.46298/dmtcs.3492/
