Analysis of the usage of problem-oriented datasets in scientific research
Informacionnye tehnologii i vyčislitelnye sistemy, no. 3 (2022), pp. 10-23
See the article record from the source Math-Net.Ru
In this paper we consider the problem of creating and using open problem-oriented datasets to facilitate verifiable and reproducible research, based on a study of the usage of the MIDV family of datasets, which contain images and video sequences of identity documents. We present an analysis of published scientific works in the fields of computer vision, image processing, and computational linguistics that use these datasets. We describe the main problems tackled by the research groups and formulate general principles that could be used for creating and expanding datasets of this class.
Keywords:
text recognition, document analysis, datasets, reproducible research, image processing.
@article{ITVS_2022_3_a1,
author = {V. V. Arlazarov},
title = {Analysis of the usage of problem-oriented datasets in scientific research},
journal = {Informacionnye tehnologii i vy\v{c}islitelnye sistemy},
pages = {10--23},
publisher = {mathdoc},
number = {3},
year = {2022},
language = {ru},
url = {http://geodesic.mathdoc.fr/item/ITVS_2022_3_a1/}
}
V. V. Arlazarov. Analysis of the usage of problem-oriented datasets in scientific research. Informacionnye tehnologii i vyčislitelnye sistemy, no. 3 (2022), pp. 10-23. http://geodesic.mathdoc.fr/item/ITVS_2022_3_a1/