A method to reduce errors of string recognition based on combination of several recognition results with per-character alternatives

K. B. Bulatov

Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, Volume 12 (2019), no. 3, pp. 74-88

This article was harvested from the source Math-Net.Ru

Abstract

We consider the problem of recognizing a string object that appears in several frames of a video stream. To maximize the output accuracy, we combine several recognition results. To this end, we consider a model of a string recognition result that takes into account the estimates of alternative per-character classification results, and we propose an algorithm that combines string recognition results according to this model. The algorithm was evaluated on the MIDV-500 dataset of document images. The experimental results show that the proposed algorithm achieves high recognition accuracy by analyzing several images, and that using the estimates of alternative per-character classification results yields higher accuracy than combining strings that contain only the final alternative for each character.

Keywords: recognition in video stream, recognition algorithms, mobile OCR.

@article{VYURU_2019_12_3_a6,
     author = {K. B. Bulatov},
     title = {A method to reduce errors of string recognition based on combination of several recognition results with per-character alternatives},
     journal = {Vestnik \^U\v{z}no-Uralʹskogo gosudarstvennogo universiteta. Seri\^a, Matemati\v{c}eskoe modelirovanie i programmirovanie},
     pages = {74--88},
     year = {2019},
     volume = {12},
     number = {3},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/VYURU_2019_12_3_a6/}
}

TY  - JOUR
AU  - K. B. Bulatov
TI  - A method to reduce errors of string recognition based on combination of several recognition results with per-character alternatives
JO  - Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
PY  - 2019
SP  - 74
EP  - 88
VL  - 12
IS  - 3
UR  - http://geodesic.mathdoc.fr/item/VYURU_2019_12_3_a6/
LA  - en
ID  - VYURU_2019_12_3_a6
ER  -

%0 Journal Article
%A K. B. Bulatov
%T A method to reduce errors of string recognition based on combination of several recognition results with per-character alternatives
%J Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie
%D 2019
%P 74-88
%V 12
%N 3
%U http://geodesic.mathdoc.fr/item/VYURU_2019_12_3_a6/
%G en
%F VYURU_2019_12_3_a6

K. B. Bulatov. A method to reduce errors of string recognition based on combination of several recognition results with per-character alternatives. Vestnik Ûžno-Uralʹskogo gosudarstvennogo universiteta. Seriâ, Matematičeskoe modelirovanie i programmirovanie, Volume 12 (2019), no. 3, pp. 74-88. http://geodesic.mathdoc.fr/item/VYURU_2019_12_3_a6/

