MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices
Main Authors: | , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 28-11-2023 |
Summary: | The deployment of large-scale text-to-image diffusion models on mobile
devices is impeded by their substantial model size and slow inference speed. In
this paper, we propose MobileDiffusion, a highly efficient
text-to-image diffusion model obtained through extensive optimizations in both
architecture and sampling techniques. We conduct a comprehensive examination of
model architecture design to reduce redundancy, enhance computational
efficiency, and minimize the model's parameter count, while preserving image
generation quality. Additionally, we employ distillation and diffusion-GAN
finetuning techniques on MobileDiffusion to achieve 8-step and 1-step inference,
respectively. Empirical studies, conducted both quantitatively and
qualitatively, demonstrate the effectiveness of our proposed techniques.
MobileDiffusion achieves a remarkable sub-second inference speed for
generating a $512\times512$ image on mobile devices, establishing a new state
of the art. |
---|---|
DOI: | 10.48550/arxiv.2311.16567 |