Image vectorization: a review
Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part II–2, Volume 530 (2023), pp. 6-23
This article was harvested from the Math-Net.Ru source.

Many diffusion and autoregressive models now show impressive results in generating images from text and other input domains. However, these methods are not intended for ultra-high-resolution image synthesis. Vector graphics do not have this limitation, so generating images in this format is a promising direction. Instead of generating vector images directly, one can first synthesize a raster image and then apply vectorization. Vectorization is the process of converting a raster image into a similar vector image composed of primitive shapes. Besides being similar to the original, the resulting vector image should contain as few shapes as possible. In this work, we focus specifically on machine learning-compatible vectorization methods. We consider the Mang2Vec, Deep Vectorization of Technical Drawings, DiffVG, and LIVE models, and give a brief overview of existing online methods. We also recall other algorithmic methods as well as the Im2Vec and ClipGEN models, but these are excluded from the comparison because they have no open implementation or their official implementations do not work correctly. Our research shows that, despite allowing the number and type of shapes to be specified directly, existing machine learning methods take a very long time and do not accurately recreate the original image. We believe that there is no fast, universal, automatic approach and that human control is required for every method.
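The definition of vectorization given in the abstract (a similar image, built from few primitive shapes) can be illustrated with a toy sketch. The code below is not any of the reviewed methods: it is a simple greedy procedure, assumed here purely for illustration, that covers the foreground pixels of a binary raster with axis-aligned rectangles and merges runs that repeat on consecutive rows. The names `vectorize`, `rasterize`, and `to_svg` are hypothetical helpers, and the greedy merge does not guarantee the minimal number of shapes.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixel units


def vectorize(grid: List[List[int]]) -> List[Rect]:
    """Cover the 1-pixels of a binary raster with axis-aligned rectangles."""
    rects: List[Rect] = []
    # Maps a horizontal run (x, width) on the previous row to its rectangle index.
    open_runs: Dict[Tuple[int, int], int] = {}
    for y, row in enumerate(grid):
        # Collect maximal horizontal runs of foreground pixels on this row.
        runs = []
        x = 0
        while x < len(row):
            if row[x]:
                start = x
                while x < len(row) and row[x]:
                    x += 1
                runs.append((start, x - start))
            else:
                x += 1
        next_open: Dict[Tuple[int, int], int] = {}
        for run in runs:
            if run in open_runs:
                # Same extent as the row above: grow that rectangle downwards.
                i = open_runs[run]
                rx, ry, rw, rh = rects[i]
                rects[i] = (rx, ry, rw, rh + 1)
            else:
                # Otherwise start a new one-row rectangle.
                i = len(rects)
                rects.append((run[0], y, run[1], 1))
            next_open[run] = i
        open_runs = next_open
    return rects


def rasterize(rects: List[Rect], width: int, height: int) -> List[List[int]]:
    """Render rectangles back to a binary raster to check similarity."""
    grid = [[0] * width for _ in range(height)]
    for x, y, w, h in rects:
        for yy in range(y, y + h):
            for xx in range(x, x + w):
                grid[yy][xx] = 1
    return grid


def to_svg(rects: List[Rect], width: int, height: int) -> str:
    """Serialize the rectangles as a minimal SVG document."""
    body = "".join(
        f'<rect x="{x}" y="{y}" width="{w}" height="{h}"/>' for x, y, w, h in rects
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'viewBox="0 0 {width} {height}">{body}</svg>'
    )


image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
shapes = vectorize(image)
print(len(shapes), "shapes;", "exact:", rasterize(shapes, 4, 4) == image)
print(to_svg(shapes, 4, 4))
```

In this toy case the reconstruction is exact and the shape count is the only quantity left to reduce. For natural images the two goals trade off against each other, which is the setting the reviewed methods address.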
@article{ZNSL_2023_530_a1,
     author = {M. Dziuba and I. Jarsky and V. Efimova and A. Filchenkov},
     title = {Image vectorization: a review},
     journal = {Zapiski Nauchnykh Seminarov POMI},
     pages = {6--23},
     year = {2023},
     volume = {530},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/ZNSL_2023_530_a1/}
}
TY  - JOUR
AU  - M. Dziuba
AU  - I. Jarsky
AU  - V. Efimova
AU  - A. Filchenkov
TI  - Image vectorization: a review
JO  - Zapiski Nauchnykh Seminarov POMI
PY  - 2023
SP  - 6
EP  - 23
VL  - 530
UR  - http://geodesic.mathdoc.fr/item/ZNSL_2023_530_a1/
LA  - en
ID  - ZNSL_2023_530_a1
ER  - 
%0 Journal Article
%A M. Dziuba
%A I. Jarsky
%A V. Efimova
%A A. Filchenkov
%T Image vectorization: a review
%J Zapiski Nauchnykh Seminarov POMI
%D 2023
%P 6-23
%V 530
%U http://geodesic.mathdoc.fr/item/ZNSL_2023_530_a1/
%G en
%F ZNSL_2023_530_a1
M. Dziuba; I. Jarsky; V. Efimova; A. Filchenkov. Image vectorization: a review. Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part II–2, Volume 530 (2023), pp. 6-23. http://geodesic.mathdoc.fr/item/ZNSL_2023_530_a1/

[1] A. Carlier, M. Danelljan, A. Alahi, and R. Timofte, Deepsvg: A hierarchical generative network for vector graphics animation, 2020, arXiv: 2007.11301

[2] W. Dai, T. Luo, and J. Shen, “Automatic image vectorization using superpixels and random walkers”, 2013 6th International Congress on Image and Signal Processing (CISP), v. 2, 2013, 922–926 | DOI | MR

[3] L. Deng, “The mnist database of handwritten digit images for machine learning research”, IEEE Signal Processing Magazine, 29:6 (2012), 141–142 | DOI

[4] V. Efimova, A. Chebykin, I. Jarsky, E. Prosvirnin, and A. Filchenkov, Neural style transfer for vector graphics, 2023

[5] V. Efimova, I. Jarsky, I. Bizyaev, and A. Filchenkov, Conditional vector graphics generation for music cover images, 2022, arXiv: 2205.07301

[6] V. Egiazarian, O. Voynov, A. Artemov, D. Volkhonskiy, A. Safin, M. Taktasheva, D. Zorin, and E. Burnaev, “Deep vectorization of technical drawings”, Computer Vision–ECCV 2020, 16th European Conference, Proceedings (Glasgow, UK, August 23–28, 2020), v. XIII, Springer, 2020, 582–598 | DOI

[7] P. Esser, R. Rombach, and B. Ommer, “Taming transformers for high-resolution image synthesis”, Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2021, 12873–12883

[8] K. Frans, L. B. Soros, and O. Witkowski, Clipdraw: Exploring text-to-drawing synthesis through language-image encoders, 2021, arXiv: 2106.14843

[9] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks”, Communications of the ACM, 63:11 (2020), 139–144 | DOI

[10] G. J. Hettinga, J. Echevarria, and J. Kosinka, Efficient image vectorisation using mesh colours, The Eurographics Association, 2021

[11] A. Jain, A. Xie, and P. Abbeel, Vectorfusion: Text-to-svg by abstracting pixel-based diffusion models, 2022, arXiv: 2211.11319

[12] S. Jeschke, D. Cline, and P. Wonka, “Estimating color and texture parameters for vector graphics”, Computer Graphics Forum, 30, Wiley Online Library, 2011, 523–532 | DOI

[13] D. P. Kingma and M. Welling, Auto-encoding variational bayes, 2013, arXiv: 1312.6114

[14] D. P. Kingma and M. Welling, An introduction to variational autoencoders, 2019, arXiv: 1906.02691

[15] Y.-K. Lai, S.-M. Hu, and R. R. Martin, “Automatic and topology-preserving gradient mesh generation for image vectorization”, ACM Transactions on Graphics (TOG), 28:3 (2009), 1–8 | DOI | MR

[16] G. Lecot and B. Lévy, “Ardeco: automatic region detection and conversion”, Eurographics Symposium on Rendering, 2006 | Zbl

[17] T.-M. Li, M. Lukáč, M. Gharbi, and J. Ragan-Kelley, “Differentiable vector graphics rasterization for editing and learning”, ACM Transactions on Graphics (TOG), 39:6 (2020), 193:1–193:15

[18] Z. Liao, H. Hoppe, D. Forsyth, and Y. Yu, “A subdivision-based representation for vector image editing”, IEEE transactions on visualization and computer graphics, 18:11 (2012), 1858–1867 | DOI

[19] R. G. Lopes, D. Ha, D. Eck, and J. Shlens, A learned representation for scalable vector graphics, 2019, arXiv: 1904.02632

[20] X. Ma, Y. Zhou, X. Xu, B. Sun, V. Filev, N. Orlov, Y. Fu, and H. Shi, “Towards layer-wise image vectorization”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 16314–16323

[21] B. Price and W. Barrett, “Object-based vectorization for interactive image editing”, The Visual Computer, 22 (2006), 661–670 | DOI

[22] S. Pun and C. Tsang, Vtracer, 2020

[23] P. Reddy, M. Gharbi, M. Lukac, and N. J. Mitra, “Im2vec: Synthesizing vector graphics without vector supervision”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, 7342–7351

[24] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 10684–10695

[25] C. Saharia, W. Chan, S. Saxena, L. Li, J. Whang, E. L. Denton, K. Ghasemipour, R. Gontijo Lopes, B. Karagol Ayan, T. Salimans, et al., “Photorealistic text-to-image diffusion models with deep language understanding”, Advances in Neural Information Processing Systems, 35 (2022), 36479–36494

[26] P. Schaldenbrand, Z. Liu, and J. Oh, Styleclipdraw: Coupling content and style in text-to-drawing translation, 2022, arXiv: 2202.12362

[27] I.-C. Shen and B.-Y. Chen, “Clipgen: A deep generative model for clipart vectorization and synthesis”, IEEE Transactions on Visualization and Computer Graphics, 28:12 (2021), 4211–4224 | DOI

[28] H. Su, J. Niu, X. Liu, J. Cui, and J. Wan, Vectorization of raster manga by deep reinforcement learning, 2021, arXiv: 2110.04830

[29] J. Sun, L. Liang, F. Wen, and H.-Y. Shum, “Image vectorization using optimized gradient meshes”, ACM Transactions on Graphics (TOG), 26:3 (2007), 11–es | DOI

[30] S. Swaminarayan and L. Prasad, “Rapid automated polygonal image decomposition”, 35th IEEE Applied Imagery and Pattern Recognition Workshop (AIPR'06), 2006, 28–28

[31] X. Tian and T. Günther, “A survey of smooth vector graphics: Recent advances in representation, creation, rasterization and image vectorization”, IEEE Transactions on Visualization and Computer Graphics, 2022

[32] Y. Wang and Z. Lian, “Deepvecfont: synthesizing high-quality vector fonts via dual-modality learning”, ACM Transactions on Graphics (TOG), 40:6 (2021), 1–15

[33] G. Xie, X. Sun, X. Tong, and D. Nowrouzezahrai, “Hierarchical diffusion curves for accurate automatic image vectorization”, ACM Transactions on Graphics (TOG), 33:6 (2014), 1–11 | DOI

[34] M. Yang, H. Chao, C. Zhang, J. Guo, L. Yuan, and J. Sun, “Effective clipart image vectorization through direct optimization of bezigons”, IEEE Transactions on Visualization and Computer Graphics, 22 (2016), 1063–1075 | DOI

[35] J. Yu, Y. Xu, J. Y. Koh, T. Luong, G. Baid, Z. Wang, V. Vasudevan, A. Ku, Y. Yang, B. K. Ayan, et al., Scaling autoregressive models for content-rich text-to-image generation, 2022, arXiv: 2206.10789

[36] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric”, Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, 586–595

[37] S. Zhao, F. Durand, and C. Zheng, “Inverse diffusion curves using shape optimization”, IEEE transactions on visualization and computer graphics, 24:7 (2017), 2153–2166 | DOI

[38] H. Zhou, J. Zheng, and L. Wei, “Representing images using curvilinear feature driven subdivision surfaces”, IEEE transactions on image processing, 23:8 (2014), 3268–3280 | DOI | MR | Zbl