The Zipf law for random texts with unequal letter probabilities and the Pascal pyramid
Izvestiâ vysših učebnyh zavedenij. Matematika, no. 12 (2012), pp. 30-33.

See the article record from the source Math-Net.Ru

The article analyzes a model of word generation in which letters occur independently and with unequal probabilities. It is proved that the probability $p(r)$ of the word of rank $r$ has power-law asymptotic behavior. The proof is short and uses elementary methods that differ from those of Conrad and Mitzenmacher. An explicit formula for the exponent of the power law is also derived.
Keywords: Zipf law, monkey model, order statistics, power laws, recursive sequences, functional equations, Pascal pyramid.
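The model described in the abstract can be explored numerically. The following is a minimal sketch, not the authors' method: it assumes a word is a nonempty run of letters ended by a space, so that a word has probability `space_prob` times the product of its letter probabilities. This record does not state the explicit formula for the exponent, so the sketch assumes the characterization commonly given for this model, namely $p(r)$ of order $r^{-1/\gamma}$, where $\gamma$ solves $\sum_i q_i^{\gamma} = 1$ over the letter probabilities $q_i$. The function names `zipf_exponent` and `top_word_probs` and the example probabilities are hypothetical.

```python
import heapq
import math


def zipf_exponent(letter_probs, tol=1e-12):
    """Return 1/gamma, where sum(q**gamma for q in letter_probs) == 1.

    Assumption: this is the power-law exponent for the model; the formula
    is not quoted in this record. Requires at least two letters whose
    probabilities sum to less than 1 (the remainder is the space).
    """
    def f(g):
        return sum(q ** g for q in letter_probs) - 1.0

    # f is decreasing, f(0) = n - 1 > 0 and f(1) = -space_prob < 0,
    # so the root lies in (0, 1) and bisection applies.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 1.0 / ((lo + hi) / 2.0)


def top_word_probs(letter_probs, space_prob, count):
    """Return the `count` largest word probabilities in rank order.

    Best-first enumeration: every extension of a word is less probable
    than the word itself, so a max-heap yields words by decreasing
    probability. Values are stored negated because heapq is a min-heap.
    """
    heap = [-space_prob * q for q in letter_probs]
    heapq.heapify(heap)
    ranked = []
    while len(ranked) < count:
        neg_p = heapq.heappop(heap)
        ranked.append(-neg_p)
        for q in letter_probs:
            heapq.heappush(heap, neg_p * q)  # append one more letter
    return ranked


if __name__ == "__main__":
    letters, space = [0.5, 0.3], 0.2  # hypothetical probabilities
    probs = top_word_probs(letters, space, 5000)
    # Crude log-log slope between rank 50 and rank 5000.
    slope = (math.log(probs[-1]) - math.log(probs[49])) / (
        math.log(5000) - math.log(50)
    )
    print("assumed exponent:", zipf_exponent(letters))
    print("empirical slope: ", -slope)
```

Because $p(r)$ is only bounded above and below by constant multiples of the power law, the empirical slope fluctuates around the assumed exponent rather than converging to it smoothly; a two-point slope is used here only to keep the sketch short.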
@article{IVM_2012_12_a2,
     author = {V. V. Bochkarev and E. Yu. Lerner},
     title = {The {Zipf} law for random texts with unequal letter probabilities and the {Pascal} pyramid},
     journal = {Izvesti\^a vys\v{s}ih u\v{c}ebnyh zavedenij. Matematika},
     pages = {30--33},
     publisher = {mathdoc},
     number = {12},
     year = {2012},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/IVM_2012_12_a2/}
}
V. V. Bochkarev; E. Yu. Lerner. The Zipf law for random texts with unequal letter probabilities and the Pascal pyramid. Izvestiâ vysših učebnyh zavedenij. Matematika, no. 12 (2012), pp. 30-33. http://geodesic.mathdoc.fr/item/IVM_2012_12_a2/

[1] Michel J. B., Shen Y. K., Aiden A. P., Veres A., Gray M. K., Pickett J. P., Hoiberg D., Clancy D., Norvig P., Orwant J., Pinker S., Nowak M. A., Aiden E. L., “Quantitative analysis of culture using millions of digitized books”, Science, 331:6014 (2011), 176–182 http://www.librarian.net/wp-content/uploads/science-googlelabs.pdf

[2] Maslov V. P., Maslova T. V., “O zakone Tsipfa i rangovykh raspredeleniyakh v lingvistike i semiotike” [On Zipf's law and rank distributions in linguistics and semiotics], Matem. zametki, 80:5 (2006), 718–732

[3] Mandelbrot B., “An informational theory of the statistical structure of language”, Communication Theory, ed. W. Jackson, Butterworths, 1953, 486–502

[4] Miller G. A., “Some effects of intermittent silence”, Amer. J. Psychology, 70 (1957), 311–314

[5] Li W., “Random texts exhibit Zipf's-law-like word frequency distribution”, IEEE Transactions on Information Theory, 38 (1992), 1842–1845

[6] Gusein-Zade S. M., “O raspredelenii bukv russkogo yazyka po chastote vstrechaemosti” [On the distribution of letters of the Russian language by frequency of occurrence], Probl. peredachi inform., 24:4 (1988), 102–107

[7] Gusein-Zade S. M., “O vstrechaemosti klyuchevykh slov i o drugikh ranzhirovannykh ryadakh v informatike” [On the occurrence of keywords and other ranked series in information science], Nauchno-tekhnicheskaya informatsiya. Ser. 2, 1987, no. 1, 28–31

[8] Conrad B., Mitzenmacher M., “Power laws for monkeys typing randomly: the case of unequal probabilities”, IEEE Transactions on Information Theory, 50 (2004), 1403–1414 http://www.eecs.harvard.edu/~michaelm/postscripts/toit2004a.pdf

[9] Bochkarev V. V., Lerner E. Yu., Zipf and non-Zipf laws for homogeneous Markov chain, arXiv: 1207.1872