On using the computer linguistic models in the classification of biomedical images
Matematičeskoe modelirovanie, Volume 35 (2023) no. 12, pp. 18–30.

See the article record from the source Math-Net.Ru

Computer linguistic models have become widespread in natural language processing and have recently been actively applied to various computer vision problems. This article presents computational experiments that assess the effectiveness of transformer models in the task of classifying X-ray images of the lungs. Pre-trained Vision Transformer models of different sizes (ViT-B/16, ViT-B/32, ViT-L/16, ViT-L/32) were fine-tuned on a set of lung X-ray images. Experiments with the convolutional neural networks VGG-16, Inception V3, ResNet50, EfficientNetV2, and DenseNet121 were also conducted. A comparative analysis of the classification results showed that the ViT-B/32 transformer model achieves the best performance: accuracy = 97.56%, AUC = 99%.
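For illustration, the fine-tuning procedure described in the abstract can be sketched with the ViT-Keras library [23] on top of TensorFlow/Keras. This is a minimal sketch, not the authors' exact pipeline: the dataset paths, input resolution, class count, and training hyperparameters below are assumptions.

import tensorflow as tf
from vit_keras import vit

IMAGE_SIZE = 224   # assumed input resolution
NUM_CLASSES = 2    # illustrative label set; the actual number of classes may differ

# ViT-B/32 backbone with pre-trained weights, original classifier head removed.
backbone = vit.vit_b32(
    image_size=IMAGE_SIZE,
    pretrained=True,
    include_top=False,
    pretrained_top=False,
)

# New classification head for the X-ray labels; the pre-trained ViT weights
# expect inputs scaled to [-1, 1], hence the Rescaling layer.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0),
    backbone,
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy", tf.keras.metrics.AUC(name="auc")],
)

# Hypothetical directory layout: one subfolder of images per class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "xray/train", image_size=(IMAGE_SIZE, IMAGE_SIZE), label_mode="categorical")
val_ds = tf.keras.utils.image_dataset_from_directory(
    "xray/val", image_size=(IMAGE_SIZE, IMAGE_SIZE), label_mode="categorical")

model.fit(train_ds, validation_data=val_ds, epochs=10)

The CNN baselines mentioned in the abstract (e.g. DenseNet121) can be fine-tuned the same way by swapping the backbone for tf.keras.applications.DenseNet121(include_top=False, pooling="avg").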
Keywords: transformers, deep convolutional networks, lung X-ray images, classification.
@article{MM_2023_35_12_a1,
     author = {E. Yu. Shchetinin},
     title = {On using the computer linguistic models in the classification of biomedical images},
     journal = {Matemati\v{c}eskoe modelirovanie},
     pages = {18--30},
     publisher = {mathdoc},
     volume = {35},
     number = {12},
     year = {2023},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MM_2023_35_12_a1/}
}
TY  - JOUR
AU  - E. Yu. Shchetinin
TI  - On using the computer linguistic models in the classification of biomedical images
JO  - Matematičeskoe modelirovanie
PY  - 2023
SP  - 18
EP  - 30
VL  - 35
IS  - 12
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/MM_2023_35_12_a1/
LA  - ru
ID  - MM_2023_35_12_a1
ER  - 
%0 Journal Article
%A E. Yu. Shchetinin
%T On using the computer linguistic models in the classification of biomedical images
%J Matematičeskoe modelirovanie
%D 2023
%P 18-30
%V 35
%N 12
%I mathdoc
%U http://geodesic.mathdoc.fr/item/MM_2023_35_12_a1/
%G ru
%F MM_2023_35_12_a1
E. Yu. Shchetinin. On using the computer linguistic models in the classification of biomedical images. Matematičeskoe modelirovanie, Volume 35 (2023) no. 12, pp. 18–30. http://geodesic.mathdoc.fr/item/MM_2023_35_12_a1/

[1] Y. Liu, Y. Zhang, Y. Wang, F. Hou, A Survey of Visual Transformers, 2021, arXiv: 2111.06091v1

[2] E. U. Henry, O. Emebo, A. C. Omonhinmin, Vision Transformers in Medical Imaging: A Review, 2022, arXiv: 2211.10043

[3] B. Wu, C. Xu, X. Dai, X. Wan, P. Zhang, Z. Yan, et al., Visual Transformers: Token-based Image Representation and Processing for Computer Vision, 2020, arXiv: 2006.03677v4

[4] C. Matsoukas, J. F. Haslum, M. Soderberg, K. Smith, Is it Time to Replace CNNs with Transformers for Medical Images?, 2021, arXiv: 2108.09038v1

[5] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, et al., An image is worth 16×16 words: Transformers for image recognition at scale, 2021, arXiv: 2010.11929

[6] B. N. Patro, V. S. Agneeswaran, Efficiency 360: Efficient Vision Transformers, 2023, arXiv: 2302.08374v2

[7] M. Raghu, T. Unterthiner, S. Kornblith, C. Zhang, A. Dosovitskiy, Do vision transformers see like convolutional neural networks?, 2021, arXiv: 2108.08810

[8] H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, “Training data-efficient image transformers and distillation through attention”, Intern. Conf. on Machine Learning, PMLR, 2021, 10347–10357

[9] K. He, C. Gan, Z. Li, I. Rekik, Z. Yin, W. Ji, Y. Gao, Q. Wang, J. Zhang, “Transformers in medical image analysis”, Intelligent Medicine, 3 (2023), 59–78

[10] D. Shome, T. Kar, S. N. Mohanty, P. Tiwari, K. Muhammad, A. AlTameem, Y. Zhang, “COVID Transformer: Interpretable COVID-19 Detection Using Vision Transformer for Healthcare”, Intern. J. of Environmental Research and Public Health, 18 (2021), 11086

[11] K. Krishnan, Vision Transformer based COVID-19 Detection using Chest X-rays, 2021, arXiv: 2110.04458v1

[12] L. Balderas, M. Lastra, A. Lainez-Ramos-Bossini, J. Benitez, “COVID-ViT: COVID-19 Detection Method Based on Vision Transformers”, International Conf. on Intelligent Systems Design and Applications, 2023, 81–90

[13] S. Park et al., Vision transformer for COVID-19 CXR diagnosis using chest X-ray feature corpus, 2021, arXiv: 2103.07055

[14] M. Chetoui, M. Akhloufi, “Explainable vision transformers and radiomics for COVID-19 detection in chest X-rays”, J. Clin. Med., 11 (2022), 3013

[15] A. Marefat, M. Marefat, J. H. Joloudari, M. A. Nematollahi, R. Lashgari, “CCTCOVID: COVID-19 detection from chest X-ray images using compact convolutional transformers”, Frontiers in Public Health, Sec. Digital Public Health, 11 (2023)

[16] S. Khan, M. Naseer, M. Hayat, S. W. Zamir, F. S. Khan, Transformers in vision: A survey, 2021, arXiv: 2101.01169

[17] K. P. Murphy, Probabilistic Machine Learning, MIT Press, Cambridge, MA, 2021

[18] Chest X-ray images for the detection of COVID-19, https://github.com/lindawangg/COVID-Net

[19] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014, arXiv: 1409.1556

[20] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, “Rethinking the inception architecture for computer vision”, Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2016, 2818–2826

[21] K. He, X. Zhang, S. Ren, J. Sun, “Deep residual learning for image recognition”, Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2016, 770–778

[22] M. Tan, Q. V. Le, EfficientNetV2: Smaller Models and Faster Training, 2021, arXiv: 2104.00298v3

[23] ViT-Keras library, https://pypi.org/project/vit-keras/

[24] E. Yu. Shchetinin, “Detection of the COVID-19 coronavirus infection based on the analysis of chest X-ray images by deep learning methods”, Computer Optics (Kompyuternaya optika), 46:6 (2022), 963–970 (in Russian)

[25] N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, S. Zagoruyko, “End-to-end object detection with transformers”, European Conf. on Computer Vision (ECCV), 2020, 213–229

[26] A. Steiner, A. Kolesnikov, X. Zhai, R. Wightman, J. Uszkoreit, L. Beyer, How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers, 2021, arXiv: 2106.10270

[27] T. Ridnik, E. Ben-Baruch, A. Noy, L. Zelnik-Manor, ImageNet-21K Pretraining for the Masses, 2021, arXiv: 2104.10972