Classification and recognition of structures of genetic sequences
Izvestiya of Saratov University. Mathematics. Mechanics. Informatics, Volume 19 (2019), no. 3, pp. 338-350.

See the article record from the source Math-Net.Ru

To study how the properties of organisms relate to the properties of the corresponding genetic sequences, we propose a classification of genetic sequences based on numerical indicators of recurrent and $Z$-recurrent shapes, which define the structure of the functional relationships between the elements of a sequence, and we introduce a classification method built on these indicators. For each sequence of a biological rank considered in the recognition problem, we compare a numerical characteristic that generalizes numerical values with a meaningful interpretation in the application area against a numerical characteristic of the recurrent or $Z$-recurrent shapes that determine the structure of the sequence. The recognition problem is considered in two settings: deciding whether a sequence belongs to a specific rank of sequences, and determining which group of sequences contains an experimental sequence. The main mathematical difficulty in solving these recognition problems lies in finding differences in the numerical representations of the recurrent and $Z$-recurrent shapes of experimental sequences. To overcome this difficulty, we construct a spectrum of numerical indicators of recurrent and $Z$-recurrent shapes. Classification and recognition are illustrated by an example with three ranks of genetic codes of organisms, each represented by $5$ sequences. The $Z$-recurrent shape is introduced to refine and extend the classification of sequences and to increase the efficiency of recognition methods.
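The abstract does not define the indicators, so the following is only a minimal sketch under a stated assumption: it takes the numerical indicator of a recurrent shape to be the smallest order m for which every window of m consecutive symbols determines the next symbol uniquely. The function names `recurrence_order`, `mean_order`, and `recognize`, the nearest-mean decision rule, and the toy sequences are all hypothetical illustrations, not the authors' method, and the $Z$-recurrent variant is not modelled.

```python
def recurrence_order(seq):
    """Smallest m such that each length-m window fixes the next symbol.

    Assumed stand-in for a 'numerical indicator of a recurrent shape';
    the paper's actual definition may differ.
    """
    n = len(seq)
    for m in range(n):
        successor = {}  # window -> the symbol observed right after it
        consistent = True
        for i in range(n - m):
            window, nxt = seq[i:i + m], seq[i + m]
            # setdefault stores nxt on first sight, else returns the stored one
            if successor.setdefault(window, nxt) != nxt:
                consistent = False
                break
        if consistent:
            return m
    return n  # only reached for an empty sequence, where n == 0


def mean_order(group):
    """Average indicator over the sequences of one group (rank)."""
    return sum(recurrence_order(s) for s in group) / len(group)


def recognize(sample, groups):
    """Label of the group whose mean indicator is closest to the sample's.

    Hypothetical nearest-mean rule, used here only for illustration.
    """
    target = recurrence_order(sample)
    return min(groups, key=lambda label: abs(mean_order(groups[label]) - target))


# Toy data, not real genetic sequences.
groups = {
    "rank_x": ["ACACAC", "GTGTGT"],
    "rank_y": ["AACAAC", "GGTGGT"],
}
label = recognize("TCTCTC", groups)
```

A single indicator separates these toy groups, but real sequences would likely need the whole spectrum of indicators the abstract mentions; a vector of such values per sequence could replace the scalar used here.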
@article{ISU_2019_19_3_a7,
     author = {V. A. Tverdokhlebov and D. A. Kariakin},
     title = {Classification and recognition of structures of genetic sequences},
     journal = {Izvestiya of Saratov University. Mathematics. Mechanics. Informatics},
     pages = {338--350},
     publisher = {mathdoc},
     volume = {19},
     number = {3},
     year = {2019},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/ISU_2019_19_3_a7/}
}
V. A. Tverdokhlebov; D. A. Kariakin. Classification and recognition of structures of genetic sequences. Izvestiya of Saratov University. Mathematics. Mechanics. Informatics, Volume 19 (2019), no. 3, pp. 338-350. http://geodesic.mathdoc.fr/item/ISU_2019_19_3_a7/