Investigation of Latent Periodicity Phenomenon in the Genomes of Eukaryotic Organisms
Matematičeskaâ biologiâ i bioinformatika, Vol. 8 (2013), pp. 480-501.

See the article record from the source Math-Net.Ru

An analysis of data from the first release of the HeteroGenome database, which collects the regions of latent periodicity revealed in the genomes of several eukaryotic organisms, is presented. Tandem repeats with varying degrees of pattern-copy conservation, including highly diverged repeats, were identified in the genomes of S. cerevisiae, A. thaliana, C. elegans and D. melanogaster. The data were obtained with an original spectral-statistical approach to searching for reliable regions of latent periodicity in DNA sequences. A two-level structure was introduced for data presentation: the first, nonredundant level gives a general view of the regions of latent periodicity, while the second level considers only the fragments with a conserved periodic structure. This structure made it possible to estimate the share of the genome covered by regions of latent periodicity, which amounts to $\sim10\%$ of the whole genome length; the estimate is derived from the first-level data. An analysis of the quantitative and qualitative content (corresponding to the divergence levels) of the latent periodicity regions across all chromosomes of the organisms considered revealed the types of periodicity characteristic of each organism's genome. Histograms showing the density distribution of latent periodicity regions along each chromosome of the analyzed genomes were built. A repertoire of period lengths was revealed in the genomes. Moreover, the HeteroGenome database offers additional options for analyzing its data and is freely available at http://www.jcbi.ru/lp_baze/.
@article{MBB_2013_8_a5,
     author = {M. B. Chaley and V. A. Kutyrkin and E. I. Teplukhina and G. E. Tyulbasheva and N. N. Nazipova},
     title = {Investigation of {Latent} {Periodicity} {Phenomenon} in the {Genomes} of {Eukaryotic} {Organisms}},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {480--501},
     publisher = {mathdoc},
     volume = {8},
     year = {2013},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2013_8_a5/}
}
TY  - JOUR
AU  - M. B. Chaley
AU  - V. A. Kutyrkin
AU  - E. I. Teplukhina
AU  - G. E. Tyulbasheva
AU  - N. N. Nazipova
TI  - Investigation of Latent Periodicity Phenomenon in the Genomes of Eukaryotic Organisms
JO  - Matematičeskaâ biologiâ i bioinformatika
PY  - 2013
SP  - 480
EP  - 501
VL  - 8
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/MBB_2013_8_a5/
LA  - ru
ID  - MBB_2013_8_a5
ER  - 
%0 Journal Article
%A M. B. Chaley
%A V. A. Kutyrkin
%A E. I. Teplukhina
%A G. E. Tyulbasheva
%A N. N. Nazipova
%T Investigation of Latent Periodicity Phenomenon in the Genomes of Eukaryotic Organisms
%J Matematičeskaâ biologiâ i bioinformatika
%D 2013
%P 480-501
%V 8
%I mathdoc
%U http://geodesic.mathdoc.fr/item/MBB_2013_8_a5/
%G ru
%F MBB_2013_8_a5
M. B. Chaley; V. A. Kutyrkin; E. I. Teplukhina; G. E. Tyulbasheva; N. N. Nazipova. Investigation of Latent Periodicity Phenomenon in the Genomes of Eukaryotic Organisms. Matematičeskaâ biologiâ i bioinformatika, Vol. 8 (2013), pp. 480-501. http://geodesic.mathdoc.fr/item/MBB_2013_8_a5/
