Keywords: diffusion model; text-to-image; mobile devices
@article{10_14736_kyb_2024_6_0819,
author = {Qifeng, Wu},
title = {DDIMCache: {An} enhanced text-to-image diffusion model on mobile devices},
journal = {Kybernetika},
pages = {819--833},
year = {2024},
volume = {60},
number = {6},
doi = {10.14736/kyb-2024-6-0819},
zbl = {07980824},
language = {en},
url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2024-6-0819/}
}
TY - JOUR
AU - Qifeng, Wu
TI - DDIMCache: An enhanced text-to-image diffusion model on mobile devices
JO - Kybernetika
PY - 2024
SP - 819
EP - 833
VL - 60
IS - 6
UR - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2024-6-0819/
DO - 10.14736/kyb-2024-6-0819
LA - en
ID - 10_14736_kyb_2024_6_0819
ER -
Qifeng, Wu. DDIMCache: An enhanced text-to-image diffusion model on mobile devices. Kybernetika, Vol. 60 (2024) no. 6, pp. 819-833. doi: 10.14736/kyb-2024-6-0819
[1] Rombach, R., Blattmann, A., Lorenz, D., et al.: High-resolution image synthesis with latent diffusion models. In: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022, pp. 10684-10695.
[2] Hou, J., Asghar, Z.: World's first on-device demonstration of Stable Diffusion on an Android phone. Qualcomm 24 (2023).
[3] Chen, Y. H., Sarokin, R., Lee, J., et al.: Speed is all you need: On-device acceleration of large diffusion models via GPU-aware optimizations. In: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023, pp. 4651-4655.
[4] Shang, Y., Yuan, Z., et al.: Post-training quantization on diffusion models. In: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023, pp. 1972-1981.
[5] Li, X., Liu, Y., Lian, L., et al.: Q-Diffusion: Quantizing diffusion models. In: Proc. IEEE/CVF International Conference on Computer Vision 2023, pp. 17535-17545.
[6] Ma, X., Fang, G., Wang, X.: LLM-Pruner: On the structural pruning of large language models. Adv. Neural Inform. Process. Systems 36 (2023), 21702-21720.
[7] Li, Y., Yuan, G., Wen, Y., et al.: EfficientFormer: Vision transformers at MobileNet speed. Adv. Neural Inform. Process. Systems 35 (2022), 12934-12949.
[8] Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., et al.: Deep unsupervised learning using nonequilibrium thermodynamics. In: International Conference on Machine Learning, PMLR 2015, pp. 2256-2265.
[9] Song, J., Meng, C., Ermon, S.: Denoising diffusion implicit models. arXiv preprint, 2020.
[10] Jain, S. M.: Hugging Face. In: Introduction to transformers for NLP: With the Hugging Face library and models to solve problems. Apress, Berkeley 2022, pp. 51-67.
[11] Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention, MICCAI 2015. Proc. 18th International Conference, Munich 2015, Part III 18. Springer International Publishing, pp. 234-241.
[12] Lin, T. Y., Maire, M., Belongie, S., et al.: Microsoft COCO: Common objects in context. In: Computer Vision - ECCV 2014. Proc. 13th European Conference, Zurich 2014, Part V 13. Springer International Publishing 2014, pp. 740-755.
[13] Nichol, A. Q., Dhariwal, P.: Improved denoising diffusion probabilistic models. In: International Conference on Machine Learning, PMLR 2021, pp. 8162-8171.