Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes
Matematičeskaâ biologiâ i bioinformatika, Vol. 19 (2024) no. 2, pp. 593-606

See the article record from the source Math-Net.Ru

Owing to the rapid growth of viral genome data produced by metagenomic studies, bioinformatics and virology increasingly interact. The term viral informatics has even emerged, implying a whole complex of databases, knowledge bases about viruses, and software tools for working with them. Annotation of viral genomes has previously been identified as one of the bioinformatics problems in virology. In the present work, using recognition of the subgenus and genus of coronaviruses as an example, a fairly simple and effective typological approach to virus annotation is proposed, which uses the codon frequency characteristics of individual genes. A typological approach averages known data, here the codon frequency characteristics, and then determines how closely the corresponding characteristics of the object under consideration resemble them. Recognition of subgenus and genus is based on a statistic that reveals the deviation of the coronavirus gene under consideration from the corresponding gene of a viral genome with a known genus or subgenus. The work compares recognition based on structural genes encoding virion proteins (nucleocapsid protein N and spike protein S) with recognition based on the genes of non-structural proteins combined into a single reading frame, ORF1ab. Four typological approaches are discussed in the article. In the first two, the data were averaged over the genera, using all available data in the first and data on prototype strains only in the second. In the third approach, the original data on prototype strains were averaged over the subgenera. The fourth approach was based on the individual frequency characteristics of the prototype strains of the subgenera. Three of the four typological approaches proved highly efficient in recognizing the genus and subgenus of coronaviruses from the N gene. The fourth approach proved the most effective for identifying the genus and subgenus of coronaviruses. In addition, it made it possible to reduce the number of codons considered in the N gene of coronaviruses and to increase recognition efficiency to almost 100%.
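The general idea described in the abstract (codon frequencies of a gene, reference profiles obtained by averaging, and assignment to the reference with the smallest deviation) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give their statistic, so the sum of squared differences used here is an assumed stand-in, and the function names, group labels, and short sequences are all hypothetical toy values, not coronavirus data.

```python
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every profile has the same layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(cds):
    """Relative frequencies of the 64 codons in a coding sequence (frame 0)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_SET)  # skip ambiguous codons
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid codons")
    return [counts[c] / total for c in CODONS]


def average_profile(profiles):
    """Element-wise mean of several codon frequency profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def deviation(freqs, reference):
    """Assumed deviation statistic: sum of squared frequency differences."""
    return sum((f - r) ** 2 for f, r in zip(freqs, reference))


def recognize(cds, references):
    """Return the label of the reference profile closest to the gene."""
    freqs = codon_frequencies(cds)
    return min(references, key=lambda label: deviation(freqs, references[label]))


# Hypothetical toy sequences standing in for genes of known groups.
training = {
    "group_A": ["ATGGCTGCTGCTAAA", "ATGGCTGCTAAAGCT"],
    "group_B": ["ATGTTTTTTCCCTTT", "ATGTTTCCCCCCTTT"],
}

# Averaged reference profile per group, as in the averaging-based approaches.
references = {
    label: average_profile([codon_frequencies(s) for s in seqs])
    for label, seqs in training.items()
}

print(recognize("ATGGCTAAAGCTGCT", references))
```

An approach based on individual prototype strains, like the fourth one in the abstract, would correspond in this sketch to building `references` from a single profile per label instead of an average.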
@article{MBB_2024_19_2_a8,
     author = {M. B. Chaley and V. A. Kutyrkin},
     title = {Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {593--606},
     publisher = {mathdoc},
     volume = {19},
     number = {2},
     year = {2024},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2024_19_2_a8/}
}
M. B. Chaley; V. A. Kutyrkin. Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes. Matematičeskaâ biologiâ i bioinformatika, Vol. 19 (2024) no. 2, pp. 593-606. http://geodesic.mathdoc.fr/item/MBB_2024_19_2_a8/
