Object descriptors for linking structural elements of noisy document images
Informacionnye tehnologii i vyčislitelnye sistemy, no. 4 (2022), pp. 13-24
See the article record from the source Math-Net.Ru
This paper considers the problem of extracting fillable elements (fields) from a recognized document image by means of descriptors, that is, descriptions of one or more structural elements. Structural elements may be words of static text or line segments that shape the document layout. Two classes of documents are addressed: business documents with a simplified structure and a limited vocabulary, and flexible business documents whose page design may vary significantly. The descriptors are built to tolerate a significant number of possible page recognition errors. Combined descriptors consisting of several terms and line segments are described, and a descriptor-based binding algorithm is given. Experiments show that extracting combined descriptors improves the accuracy of document field recognition by 17% and the accuracy of information extraction from the document image by 16%. The Smart Document Engine SDK was used for OCR in the experiment.
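The idea of binding a field to a descriptor made of several noisy terms can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the `Word` type, the function names, the similarity threshold, and the distance limits are all hypothetical, approximate matching is done with the standard library's `difflib.SequenceMatcher`, and line segments are left out for brevity.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Word:
    """A recognized word with the top-left corner of its bounding box (assumed layout)."""
    text: str
    x: float
    y: float


def similar(a: str, b: str, threshold: float = 0.75) -> bool:
    """Compare strings approximately so that OCR errors are tolerated."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find_descriptor(words, terms, max_gap=200.0, max_dy=10.0):
    """Return the words matching all terms left to right on one text line, or None."""
    for start in words:
        if not similar(start.text, terms[0]):
            continue
        matched = [start]
        for term in terms[1:]:
            prev = matched[-1]
            candidates = [
                w for w in words
                if 0 < w.x - prev.x <= max_gap
                and abs(w.y - prev.y) <= max_dy
                and similar(w.text, term)
            ]
            if not candidates:
                break
            # Take the nearest matching word to the right of the previous term.
            matched.append(min(candidates, key=lambda w: w.x))
        else:
            return matched
    return None


def extract_field(words, terms, max_dy=10.0):
    """Return the text to the right of the descriptor on the same line, or None."""
    anchor = find_descriptor(words, terms)
    if anchor is None:
        return None
    last = anchor[-1]
    right = sorted(
        (w for w in words if w.x > last.x and abs(w.y - last.y) <= max_dy),
        key=lambda w: w.x,
    )
    return " ".join(w.text for w in right) or None


# Hypothetical OCR output with typical character confusions.
words = [
    Word("lnvoice", 10, 100), Word("Nunber", 90, 100), Word("A-1042", 200, 100),
    Word("Date", 10, 140), Word("2022-04-01", 90, 140),
]
print(extract_field(words, ["invoice", "number"]))
```

Requiring every term of the descriptor to match makes a false binding less likely than matching a single distorted word, which is the intuition behind combining several structural elements in one descriptor.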
Keywords: virtual reality, augmented reality, virtual reality helmet, immersiveness, virtual object, haptic technologies, content.
@article{ITVS_2022_4_a1,
author = {O. A. Slavin},
title = {Object descriptors for linking structural elements of noisy document images},
journal = {Informacionnye tehnologii i vy\v{c}islitelnye sistemy},
pages = {13--24},
publisher = {mathdoc},
number = {4},
year = {2022},
language = {ru},
url = {http://geodesic.mathdoc.fr/item/ITVS_2022_4_a1/}
}
O. A. Slavin. Object descriptors for linking structural elements of noisy document images. Informacionnye tehnologii i vyčislitelnye sistemy, no. 4 (2022), pp. 13-24. http://geodesic.mathdoc.fr/item/ITVS_2022_4_a1/