Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes
Matematičeskaâ biologiâ i bioinformatika, Volume 19 (2024), pp. 593-606.

See the article record from the source Math-Net.Ru

Owing to the rapid growth of viral genome data produced by metagenomic studies, bioinformatics and virology increasingly interact. The term "viral informatics" has even emerged, implying a whole complex of databases and knowledge bases about viruses, together with software tools for working with them. Annotation of viral genomes has previously been identified as one of the problems of bioinformatics in virology. Using recognition of coronavirus genus and subgenus as an example, this work proposes a fairly simple and effective typological approach to virus annotation that uses the codon frequency characteristics of individual genes. A typological approach averages known data, in particular such codon frequency characteristics, and then determines how closely the corresponding characteristics of the object under consideration resemble those averages. Recognition of genus and subgenus is based on a statistic that measures the deviation of the coronavirus gene under consideration from the corresponding gene of viral genomes with a known genus or subgenus. The work compares recognition based on structural genes encoding virion proteins (nucleocapsid protein N and spike protein S) with recognition based on the genes of non-structural proteins combined into a single reading frame, ORF1ab. Four typological approaches are discussed. In the first two, averaging over the genera was applied to all available data and to data on prototype strains only, respectively. In the third approach, the original data on prototype strains were averaged over the subgenera. The fourth approach was based on the individual frequency characteristics of the prototype strains of the subgenera. Three of the four typological approaches showed high efficiency in recognizing coronavirus genus and subgenus when the N gene was used. The fourth approach proved the most effective for identifying genus and subgenus. In addition, it made it possible to reduce the number of codons considered in the N gene and to raise recognition efficiency to almost 100%.
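The general idea of a typological approach based on codon frequencies can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the deviation statistic, so a chi-square-like measure is assumed here, and the function names (`codon_frequencies`, `average_profile`, `deviation`, `recognize`), the labels `genus_A` and `genus_B`, and the toy sequences are all hypothetical.

```python
from collections import Counter
from itertools import product

# The 64 codons in a fixed order, so every profile has the same layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(cds):
    """Relative frequencies of the 64 codons in a coding sequence read in frame 0."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_SET)  # skip ambiguous codons
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous codons")
    return [counts[c] / total for c in CODONS]


def average_profile(profiles):
    """Average several codon-frequency vectors into one reference profile."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def deviation(query, profile, eps=1e-6):
    """ASSUMED statistic: chi-square-like deviation of a query from a profile."""
    return sum((q - p) ** 2 / (p + eps) for q, p in zip(query, profile))


def recognize(query_cds, references):
    """Assign the query gene to the label whose averaged profile deviates least."""
    query = codon_frequencies(query_cds)
    scores = {}
    for label, sequences in references.items():
        profile = average_profile([codon_frequencies(s) for s in sequences])
        scores[label] = deviation(query, profile)
    return min(scores, key=scores.get), scores


# Hypothetical toy data: two made-up "genera" with two short reference genes each.
references = {
    "genus_A": ["ATGGCTGCTGCTAAA", "ATGGCTGCTAAAAAA"],
    "genus_B": ["ATGTTTTTTGGGCCC", "ATGTTTGGGGGGCCC"],
}
query = "ATGGCTGCTAAAGCT"

label, scores = recognize(query, references)
print(label, scores)
```

Averaging over all reference genes of a label corresponds loosely to the approaches that average over genera or subgenera. Passing a single prototype sequence per label instead gives a rough analogue of the approach based on individual prototype strains.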
@article{MBB_2024_19_a8,
     author = {M. B. Chaley and V. A. Kutyrkin},
     title = {Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes},
     journal = {Matemati\v{c}eska\^a biologi\^a i bioinformatika},
     pages = {593--606},
     publisher = {mathdoc},
     volume = {19},
     year = {2024},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MBB_2024_19_a8/}
}
TY  - JOUR
AU  - M. B. Chaley
AU  - V. A. Kutyrkin
TI  - Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes
JO  - Matematičeskaâ biologiâ i bioinformatika
PY  - 2024
SP  - 593
EP  - 606
VL  - 19
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/MBB_2024_19_a8/
LA  - ru
ID  - MBB_2024_19_a8
ER  - 
%0 Journal Article
%A M. B. Chaley
%A V. A. Kutyrkin
%T Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes
%J Matematičeskaâ biologiâ i bioinformatika
%D 2024
%P 593-606
%V 19
%I mathdoc
%U http://geodesic.mathdoc.fr/item/MBB_2024_19_a8/
%G ru
%F MBB_2024_19_a8
M. B. Chaley; V. A. Kutyrkin. Typological approaches to recognizing genus and subgenus of coronaviruses by structural and non-structural genes. Matematičeskaâ biologiâ i bioinformatika, Volume 19 (2024), pp. 593-606. http://geodesic.mathdoc.fr/item/MBB_2024_19_a8/
