Features and tools of the Nevod library in solving problems of extracting temporal markers in the text
Problemy fiziki, matematiki i tehniki, no. 4 (2022), pp. 84-92.

See the article record from the Math-Net.Ru source

Theoretical and methodological issues of semantic text analysis are considered with respect to fact extraction. Using the extraction of temporal markers as an example, a text search method and its implementation in the Nevod library are presented. The functional completeness of the library is analyzed by comparing its capabilities with those of Microsoft.Recognizers.Text, one of the leading tools in the field of entity recognition.
Keywords: semantic text analysis, automatic text processing, test data sets, pattern-based text search, pattern package, entity recognition, temporal markers, Nevod library, Mathematica computer algebra system.
@article{PFMT_2022_4_a13,
     author = {V. A. Savionok and V. B. Taranchuk},
     title = {Features and tools of the {Nevod} library in solving problems of extracting temporal markers in the text},
     journal = {Problemy fiziki, matematiki i tehniki},
     pages = {84--92},
     publisher = {mathdoc},
     number = {4},
     year = {2022},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/PFMT_2022_4_a13/}
}
TY  - JOUR
AU  - V. A. Savionok
AU  - V. B. Taranchuk
TI  - Features and tools of the Nevod library in solving problems of extracting temporal markers in the text
JO  - Problemy fiziki, matematiki i tehniki
PY  - 2022
SP  - 84
EP  - 92
IS  - 4
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/PFMT_2022_4_a13/
LA  - ru
ID  - PFMT_2022_4_a13
ER  - 
%0 Journal Article
%A V. A. Savionok
%A V. B. Taranchuk
%T Features and tools of the Nevod library in solving problems of extracting temporal markers in the text
%J Problemy fiziki, matematiki i tehniki
%D 2022
%P 84-92
%N 4
%I mathdoc
%U http://geodesic.mathdoc.fr/item/PFMT_2022_4_a13/
%G ru
%F PFMT_2022_4_a13
V. A. Savionok; V. B. Taranchuk. Features and tools of the Nevod library in solving problems of extracting temporal markers in the text. Problemy fiziki, matematiki i tehniki, no. 4 (2022), pp. 84-92. http://geodesic.mathdoc.fr/item/PFMT_2022_4_a13/
