Search for Extended Repeats in Genomes Based on the Spectral-Analytical Method
Matematičeskaâ biologiâ i bioinformatika, Vol. 7 (2012) no. 2, pp. 476-492.

See the article record from the source Math-Net.Ru

A spectral-analytical approach to identifying diverged extended repeats in genomic sequences is presented. The method is based on a multiscale integral estimate of the similarity of nucleotide sequences in the space of expansion coefficients of their GC- and GA-content curves over classical orthogonal bases. Conditions for optimal approximation are found that allow different types of repeats (direct, inverted, and tandem) to be detected automatically from the spectral similarity matrix. The method works equally well at different scales of data. It can detect fragments of segmental duplications, megasatellite blocks in the genome, and regions of synteny. It can also be used for detailed study of chromosome fragments, for example to search for diverged repeats with a moderate repeat-unit length.
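The idea of comparing fragments in the space of expansion coefficients can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the basis, window size, expansion depth, or similarity measure, so the Legendre basis, the non-overlapping windows, the Euclidean distance, and all function names below are assumptions chosen for illustration. Only the GC-content curve is used, because it is unchanged by complementation, so the reverse complement of a fragment has a mirrored GC curve, and in a Legendre expansion mirroring only flips the sign of the odd-order coefficients.

```python
import numpy as np
from numpy.polynomial import legendre


def content_profile(seq, letters, window):
    """Fraction of `letters` in consecutive non-overlapping windows."""
    seq = seq.upper()
    n = len(seq) // window
    return np.array([
        sum(ch in letters for ch in seq[i * window:(i + 1) * window]) / window
        for i in range(n)
    ])


def spectrum(profile, degree):
    """Least-squares Legendre coefficients of a profile sampled on [-1, 1]."""
    x = np.linspace(-1.0, 1.0, len(profile))
    return legendre.legfit(x, profile, degree)


def spectral_distance(a, b):
    """Euclidean distance between two coefficient vectors (assumed measure)."""
    return float(np.linalg.norm(a - b))


def mirrored(coeffs):
    """Coefficients of the reversed profile: odd-order terms change sign."""
    signs = (-1.0) ** np.arange(len(coeffs))
    return coeffs * signs
```

Under these assumptions, a direct repeat candidate is a pair of fragments with a small `spectral_distance` between their spectra, and an inverted repeat candidate is a pair for which the distance becomes small after applying `mirrored` to one of the spectra. Evaluating this for all fragment pairs would give one possible form of a similarity matrix.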
@article{MBB_2012_7_2_a5,
     author = {Anton Pankratov and Maxim Pyatkov and Ruslan Tetuev and Nafisa Nazipova and Florents F. Dedus},
     title = {Search for {Extended} {Repeats} in {Genomes} {Based} on the {Spectral-Analytical} {Method}},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {476--492},
     publisher = {mathdoc},
     volume = {7},
     number = {2},
     year = {2012},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2012_7_2_a5/}
}
Anton Pankratov; Maxim Pyatkov; Ruslan Tetuev; Nafisa Nazipova; Florents F. Dedus. Search for Extended Repeats in Genomes Based on the Spectral-Analytical Method. Matematičeskaâ biologiâ i bioinformatika, Vol. 7 (2012) no. 2, pp. 476-492. http://geodesic.mathdoc.fr/item/MBB_2012_7_2_a5/

[1] Collins F. S., Morgan M., Patrinos A., “The Human Genome Project: lessons from large-scale biology”, Science, 300 (2003), 286–290 | DOI

[2] Podgornaya O. I., Ostromyshenskii D. I., Kuznetsova I. S., Matveev I. V., Komissarov A. S., “Paradoksy organizatsii tsentromera i geterokhromatina”, Tsitologiya, 51:3 (2009), 204–211

[3] Fondon J. W. III, Garner H. R., “Molecular origins of rapid and continuous morphological evolution”, Proc. Nat. Acad. Sci., 101:52 (2004), 18058–18062 | DOI

[4] Lakich D., Kazazian H. H. Jr., Antonarakis S. E., Gitschier J., “Inversions disrupting the factor VIII gene are a common cause of severe haemophilia A”, Nat. Genet., 5 (1993), 236–241 | DOI

[5] Emery A. E. H., “Emery-Dreifuss syndrome”, J. Med. Genet., 26 (1989), 637–641 | DOI

[6] Small K., Iber J., Warren S. T., “Emerin deletion reveals a common X-chromosome inversion mediated by inverted repeats”, Nat. Genet., 16 (1997), 96–99 | DOI

[7] Richards R. I., Holman K., Yu S., Sutherland G. R., “Fragile X syndrome unstable element, P(CCG)N, and other simple tandem repeat sequences are binding-sites for specific nuclear proteins”, Hum. Mol. Genet., 2 (1993), 1429–1435 | DOI

[8] Sutherland G. R., Richards R. I., “Simple tandem DNA repeats and human genetic disease”, Proc. Natl. Acad. Sci. USA, 92 (1995), 3636–3641 | DOI

[9] Mitas M., “Trinucleotide repeats associated with human disease”, Nucleic Acids Res., 25 (1997), 2245–2253 | DOI

[10] Toth G., Gaspari Z., Jurka J., “Microsatellites in different eukaryotic genomes: survey and analysis”, Genome Res., 10 (2000), 967–981 | DOI

[11] Graur D., Hide W. A., Li W. H., “Is the guinea-pig a rodent?”, Nature, 351 (1991), 649–652 | DOI

[12] Saitou N., Nei M., “The neighbor-joining method: a new method for reconstructing phylogenetic trees”, Mol. Biol. Evol., 4 (1987), 406–425

[13] Gidley J. W., “The lagomorphs an independent order”, Science, 36 (1912), 285–286 | DOI

[14] Volkov V. V., Leontev A. Yu., “Issledovanie simmetrii geneticheskikh tekstov metodom Fure-analiza”, Biopolimery i kletka, 6:6 (1990), 68–72

[15] Benson D., “Fourier method for biosequence analysis”, Nucleic Acids Res., 18 (1991), 6305–6310 | DOI

[16] Lobzin V. V., Chechetkin V. R., “Poryadok i korellyatsiya v genomnykh posledovatelnostyakh DNK. Spektralnyi podkhod”, UFN, 170:1 (2000), 57–81 | DOI

[17] Gibbs A. J., McIntyre G. A., “The diagram, a method for comparing sequences. Its use with amino acid and nucleotide sequences”, Eur. J. Biochem., 16 (1970), 1–11 | DOI

[18] Dedus F. F., Kulikova L. I., Makhortykh S. A., Nazipova N. N., Pankratov A. N., Tetuev R. K., “Analiticheskie metody raspoznavaniya povtoryayuschikhsya struktur v genomakh”, Doklady Akademii Nauk, 411:5 (2006), 599–602 | MR | Zbl

[19] Tetuev R. K., Dedus F. F., Kulikova L. I., Makhortykh S. A., Nazipova N. N., Pankratov A. N., “Recognition of the structural-functional organization of genetic sequences”, Moscow University Computational Mathematics and Cybernetics, 31:2 (2007), 49–53 | DOI | Zbl

[20] Pankratov A. N., Gorchakov M. A., Dedus F. F., Dolotova N. S., Kulikova L. I., Makhortykh S. A., Nazipova N. N., Novikova D. A., Olshevets M. M., Pyatkov M. I., Rudnev V. R., Tetuev R. K., Filippov V. V., “Spectral Analysis for Identification and Visualization of Repeats in Genetic Sequences”, Pattern Recognition and Image Analysis, 19:4 (2009), 687–692 | DOI | MR

[21] Tetuev R. K., Nazipova N. N., Pankratov A. N., Dedus F. F., “Poisk megasatellitnykh tandemnykh povtorov v genomakh eukariot po otsenke ostsillyatsii krivykh GC-soderzhaniya”, Matematicheskaya biologiya i bioinformatika, 5:1 (2010), 30–42 (accessed 20.04.2012) http://www.matbio.org/downloads/Tetuev2010(5_30).pdf

[22] Nikiforov A. F., Suslov S. K., Uvarov V. B., Klassicheskie ortogonalnye polinomy diskretnoi peremennoi, Nauka, M., 1985 | MR

[23] Nikiforov A. F., Skachkov M. V., “Metody vychisleniya q-polinomov”, Matem. modelirovanie, 13:8 (2001), 85–94 | MR | Zbl

[24] Khemming R. V., Chislennye metody dlya nauchnykh rabotnikov i inzhenerov, translated from English, Nauka, M., 1972, 400 pp.; Hamming R. W., Numerical methods for scientists and engineers, McGraw-Hill Book Company, 1962 | MR | MR

[25] Tetuev R. K., Nazipova N. N., “Consensus of repeated region of mouse chromosome 6 containing 60 tandem copies of a complex pattern”, Repbase Reports, 10:5 (2010), 776

[26] Tetuev R. K., Nazipova N. N., Dedus F. F., “Consensus of repeated region of rat chromosome 4 similar to mouse chromosome 6 repeated region, enclosed in the intergenic region between genes Hrh1 and Atg7”, Repbase Reports, 10:8 (2010), 1185

[27] Pyatkov M. I., Filippov V. V., Pankratov A. N., “Consensus of repeated region of rabbit chromosome 17 containing over 15 huge approximate tandem repeats”, Repbase Reports, 12:3 (2012) | Zbl

[28] Tilford C., Kuroda-Kawaguchi T., Skaletsky H., Rozen S., Brown L., Rosenberg M., McPherson J., Wylie K., Sekhon M., Kucaba A., Waterston R., Page D., “A physical map of the human Y chromosome”, Nature, 409 (2001), 943–945 | DOI

[29] Cavaillé J., Buiting K., Kiefmann M., Lalande M., Brannan C. I., Horsthemke B., Bachellerie J. P., Brosius J., Hüttenhofer A., “Identification of brain-specific and imprinted small nucleolar RNA genes exhibiting an unusual genomic organization”, PNAS, 97:26 (2000), 14311–14316 | DOI

[30] Jurka J., Kapitonov V. V., Pavlicek A., Klonowski P., Kohany O., Walichiewicz J., “Repbase Update, a database of eukaryotic repetitive elements”, Cytogenetic and Genome Research, 110 (2005), 462–467 | DOI

[31] Altschul S. F., Gish W., Miller W., Myers E. W., Lipman D. J., “Basic local alignment search tool”, J. Mol. Biol., 215:3 (1990), 403–410

[32] Benson G., “Tandem repeats finder: a program to analyze DNA sequences”, Nucleic Acids Res., 27 (1999), 573–578 | DOI

[33] Kolpakov R., Bana G., Kucherov G., “Mreps: efficient and flexible detection of tandem repeats in DNA”, Nucleic Acids Res., 31 (2003), 3672–3678 | DOI

[34] Ogurtsov A. Y., Roytberg M. A., Shabalina S. A., Kondrashov A. S., “OWEN: aligning long collinear regions of genomes”, Bioinformatics, 18 (2002), 1703–1704 | DOI

[35] Landau G. M., Schmidt J. P., Sokol D., “An Algorithm for Approximate Tandem Repeats”, Journal of Computational Biology, 8 (2001), 1–18 | DOI

[36] Levenshtein V. I., “Binary codes capable of correcting deletions, insertions, and reversals”, Soviet Phys. Dokl., 10 (1966), 707–710 | MR

[37] Larkin M. A., Blackshields G., Brown N. P., Chenna R., McGettigan P. A., McWilliam H., Valentin F., Wallace I. M., Wilm A., Lopez R., Thompson J. D., Gibson T. J., Higgins D. G., “ClustalW and ClustalX version 2.0”, Bioinformatics, 23 (2007), 2947–2948 | DOI

[38] Loots G. G., Ovcharenko I., “ECRbase: Database of Evolutionary Conserved Regions, Promoters, and Transcription Factor Binding Sites in Vertebrate Genomes”, Bioinformatics, 23 (2007), 122–124 | DOI