Information system for structured documents OCR quality control
Informacionnye tehnologii i vyčislitelnye sistemy, no. 2 (2018), pp. 94-102
This article was harvested from the Math-Net.Ru source
Computational experiments remain a daily routine in the development of machine learning (ML) based software such as optical character recognition (OCR). The well-known practice of continuous integration (CI) is a natural fit for developing ML software. CI involves frequent centralized builds of the program and execution of bench tests. This generates a large volume of test results, which should be readily available to developers for error analysis and for comparing software versions. This article proposes an architecture for an automatic quality control system for OCR of structured documents, covering the collection, storage, and display of bench test results. The results of all software tests are loaded into a database. Builds and bench tests can run on virtual servers under various operating systems (OS). For stability, the web server and the database run on hardware separate from the build server. Web technologies are used both for automatically uploading test results to the database and for serving user queries.
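The storage and version-comparison steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name a database engine or schema, so SQLite, the `bench_results` table, its columns, and the function names `open_results_db`, `store_result`, and `compare_builds` are all assumptions made for illustration.

```python
import sqlite3


def open_results_db(path=":memory:"):
    """Open (or create) a results database. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS bench_results (
               build_id TEXT NOT NULL,
               os_name  TEXT NOT NULL,
               dataset  TEXT NOT NULL,
               accuracy REAL NOT NULL,
               PRIMARY KEY (build_id, os_name, dataset)
           )"""
    )
    return conn


def store_result(conn, build_id, os_name, dataset, accuracy):
    """Record one bench test result; a rerun of the same test overwrites it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO bench_results VALUES (?, ?, ?, ?)",
            (build_id, os_name, dataset, accuracy),
        )


def compare_builds(conn, old_build, new_build, os_name):
    """Return {dataset: accuracy change} between two builds on one OS."""
    rows = conn.execute(
        """SELECT o.dataset, o.accuracy, n.accuracy
           FROM bench_results AS o
           JOIN bench_results AS n
             ON n.dataset = o.dataset AND n.os_name = o.os_name
           WHERE o.build_id = ? AND n.build_id = ? AND o.os_name = ?
           ORDER BY o.dataset""",
        (old_build, new_build, os_name),
    ).fetchall()
    # Round to suppress floating-point noise in the difference.
    return {dataset: round(new - old, 6) for dataset, old, new in rows}
```

In a setup like the one the abstract outlines, a build server would call something like `store_result` after each bench test run (via a web request rather than a direct connection), and the web front end would use a query like `compare_builds` to show developers where a new build regressed.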
Keywords:
computer experiment, machine learning, data processing, regression testing, continuous integration, quality control, web applications.
@article{ITVS_2018_2_a7,
author = {P. V. Bezmaternyh and E. L. Pliskin and V. V. Farsobina},
title = {Information system for structured documents {OCR} quality control},
journal = {Informacionnye tehnologii i vy\v{c}islitelnye sistemy},
pages = {94--102},
year = {2018},
number = {2},
language = {ru},
url = {http://geodesic.mathdoc.fr/item/ITVS_2018_2_a7/}
}
TY - JOUR
AU - P. V. Bezmaternyh
AU - E. L. Pliskin
AU - V. V. Farsobina
TI - Information system for structured documents OCR quality control
JO - Informacionnye tehnologii i vyčislitelnye sistemy
PY - 2018
SP - 94
EP - 102
IS - 2
UR - http://geodesic.mathdoc.fr/item/ITVS_2018_2_a7/
LA - ru
ID - ITVS_2018_2_a7
ER -
P. V. Bezmaternyh; E. L. Pliskin; V. V. Farsobina. Information system for structured documents OCR quality control. Informacionnye tehnologii i vyčislitelnye sistemy, no. 2 (2018), pp. 94-102. http://geodesic.mathdoc.fr/item/ITVS_2018_2_a7/