Search for Extended Repeats in Genomes Based on the Spectral-Analytical Method
Matematičeskaâ biologiâ i bioinformatika, Volume 7 (2012), pp. 476-492.

See the article record from the source Math-Net.Ru

A spectral-analytical approach to identifying diverged extended repeats in genomic sequences is presented. The method is based on a multiscale integral estimate of the similarity of nucleotide sequences in the space of expansion coefficients of GC- and GA-content curves in classical orthogonal bases. Conditions for optimal approximation are found that allow automatic detection of different types of repeats (direct, inverted, and tandem) from the spectral similarity matrix. The method works equally well at different data scales. It can detect fragments of segmental duplications, megasatellite blocks in the genome, and regions of synteny. It can also be used for a detailed study of chromosome fragments (searching for diverged fragments with a moderate repeat-unit length).
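The idea of comparing fragments in the space of expansion coefficients can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of Legendre polynomials as the orthogonal basis, the least-squares fit via NumPy, the window size, the expansion degree, and all function names below are assumptions made for illustration only. The sketch also shows why such a representation can flag inverted repeats: reversing a GC-content curve changes the sign of the odd-order Legendre coefficients and leaves the even-order ones unchanged.

```python
import numpy as np
from numpy.polynomial import legendre


def content_profile(seq, window, letters="GC"):
    """Fraction of `letters` in consecutive non-overlapping windows."""
    seq = seq.upper()
    n = len(seq) // window
    return np.array([
        sum(ch in letters for ch in seq[i * window:(i + 1) * window]) / window
        for i in range(n)
    ])


def legendre_spectrum(profile, degree):
    """Least-squares Legendre coefficients of a profile mapped onto [-1, 1]."""
    x = np.linspace(-1.0, 1.0, len(profile))
    return legendre.legfit(x, profile, degree)


def spectral_distance(a, b):
    """Distance between two coefficient vectors, weighted by the
    squared norms 2 / (2k + 1) of the Legendre polynomials on [-1, 1]."""
    k = np.arange(len(a))
    weights = 2.0 / (2.0 * k + 1.0)
    return float(np.sqrt(np.sum(weights * (a - b) ** 2)))


def flip_orientation(coeffs):
    """Coefficients of the reversed curve: odd orders change sign."""
    signs = (-1.0) ** np.arange(len(coeffs))
    return signs * coeffs


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical example: a random fragment and its inverted copy.
rng = np.random.default_rng(0)
fragment = "".join(rng.choice(list("ACGT"), size=2000))
a = legendre_spectrum(content_profile(fragment, 50), 8)
b = legendre_spectrum(content_profile(reverse_complement(fragment), 50), 8)
direct = spectral_distance(a, b)                       # compared as a direct repeat
inverted = spectral_distance(a, flip_orientation(b))   # compared as an inverted repeat
```

In this toy setting the inverted comparison gives a distance close to zero while the direct one does not. A full similarity matrix would repeat such comparisons for many fragment pairs and scales; how the paper chooses the basis, the approximation depth, and the decision thresholds is not specified in the abstract.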
@article{MBB_2012_7_a5,
     author = {Anton Pankratov and Maxim Pyatkov and Ruslan Tetuev and Nafisa Nazipova and Florents F. Dedus},
     title = {Search for {Extended} {Repeats} in {Genomes} {Based} on the {Spectral-Analytical} {Method}},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {476--492},
     publisher = {mathdoc},
     volume = {7},
     year = {2012},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2012_7_a5/}
}
TY  - JOUR
AU  - Anton Pankratov
AU  - Maxim Pyatkov
AU  - Ruslan Tetuev
AU  - Nafisa Nazipova
AU  - Florents F. Dedus
TI  - Search for Extended Repeats in Genomes Based on the Spectral-Analytical Method
JO  - Matematičeskaâ biologiâ i bioinformatika
PY  - 2012
SP  - 476
EP  - 492
VL  - 7
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/MBB_2012_7_a5/
LA  - ru
ID  - MBB_2012_7_a5
ER  - 
%0 Journal Article
%A Anton Pankratov
%A Maxim Pyatkov
%A Ruslan Tetuev
%A Nafisa Nazipova
%A Florents F. Dedus
%T Search for Extended Repeats in Genomes Based on the Spectral-Analytical Method
%J Matematičeskaâ biologiâ i bioinformatika
%D 2012
%P 476-492
%V 7
%I mathdoc
%U http://geodesic.mathdoc.fr/item/MBB_2012_7_a5/
%G ru
%F MBB_2012_7_a5
Anton Pankratov; Maxim Pyatkov; Ruslan Tetuev; Nafisa Nazipova; Florents F. Dedus. Search for Extended Repeats in Genomes Based on the Spectral-Analytical Method. Matematičeskaâ biologiâ i bioinformatika, Volume 7 (2012), pp. 476-492. http://geodesic.mathdoc.fr/item/MBB_2012_7_a5/
