DDIMCache: An enhanced text-to-image diffusion model on mobile devices
Kybernetika, Volume 60 (2024) no. 6, pp. 819-833
This article was harvested from the Czech Digital Mathematics Library
On June 11, 2024, OpenAI announced a collaboration with Apple to deeply integrate the ChatGPT generative language model into Apple's product lineup. With support from various generative AI models, devices such as smartphones will become more intelligent. The text-to-image diffusion model, known for its stable and superior generative capabilities, has gained wide recognition in image generation and will undoubtedly play a crucial role on mobile devices. However, the large size and complex architecture of diffusion models result in high computational costs and slow execution. As a result, diffusion models require high-end GPUs or cloud-based inference, which often raises personal privacy and data security concerns. This paper presents a multiplicative effect joint optimization method for complex models such as diffusion models, enabling efficient execution on mobile devices. The method integrates multiple optimization strategies, leveraging their interactions to create synergies and enhance overall performance. Building on this multiplicative effect joint optimization approach, we introduce DDIMCache, an enhanced text-to-image diffusion model. DDIMCache maintains image generation quality while achieving optimal speed, generating 512×512 images in approximately 6 seconds. This provides powerful image generation capabilities and an enhanced user experience for mobile users. In addition, as a foundation model, Stable Diffusion supports further applications such as image editing, inpainting, style transfer, and super-resolution, all of which can have a significant impact. The ability to run the model entirely on mobile devices without an internet connection will open up endless possibilities.
DOI: 10.14736/kyb-2024-6-0819
Classification: 68T01
Keywords: diffusion model; text-to-image; mobile devices
@article{10_14736_kyb_2024_6_0819,
author = {Qifeng, Wu},
title = {DDIMCache: {An} enhanced text-to-image diffusion model on mobile devices},
journal = {Kybernetika},
pages = {819--833},
year = {2024},
volume = {60},
number = {6},
doi = {10.14736/kyb-2024-6-0819},
zbl = {07980824},
language = {en},
url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2024-6-0819/}
}
TY  - JOUR
AU  - Qifeng, Wu
TI  - DDIMCache: An enhanced text-to-image diffusion model on mobile devices
JO  - Kybernetika
PY  - 2024
SP  - 819
EP  - 833
VL  - 60
IS  - 6
UR  - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2024-6-0819/
DO  - 10.14736/kyb-2024-6-0819
LA  - en
ID  - 10_14736_kyb_2024_6_0819
ER  -
Qifeng, Wu. DDIMCache: An enhanced text-to-image diffusion model on mobile devices. Kybernetika, Volume 60 (2024) no. 6, pp. 819-833. doi: 10.14736/kyb-2024-6-0819