Controllable image generation based on causal representation learning

Bibliographic Details
Published in:	Frontiers of Information Technology & Electronic Engineering, Vol. 25, no. 1, pp. 135-148
Main Authors:	Huang, Shanshan; Wang, Yuanhao; Gong, Zhili; Liao, Jun; Wang, Shu; Liu, Li
Format:	Journal Article
Language:	English
Published:	Hangzhou: Zhejiang University Press; Springer Nature B.V., 2024
Subjects:	Artificial intelligence; Communications Engineering; Computer Hardware; Computer Science; Computer Systems Organization and Communication Networks; Controllability; Deep learning; Electrical Engineering; Electronics and Microelectronics; Generative adversarial networks; Graphs; Image processing; Instrumentation; Machine learning; Methods; Modules; Networks; Optimization; Random variables; Representations; Semantics; Causal representation learning; Causal structure learning; Image generation; Controllable image editing; TP391.41

Description
Summary:	Artificial intelligence generated content (AIGC) has become an indispensable tool for producing large-scale content in various forms, such as images, thanks to the significant role that AI plays in imitation and production. However, interpretability and controllability remain challenges. Existing AI methods often struggle to produce images that are both flexible and controllable while respecting the causal relationships within the images. To address this issue, we have developed a novel method for causal controllable image generation (CCIG) that combines causal representation learning with bi-directional generative adversarial networks (GANs). This approach lets humans control image attributes while preserving the rationality and interpretability of the generated images, and it also allows counterfactual images to be generated. The key to CCIG is a causal structure learning module that learns the causal relationships between image attributes and is optimized jointly with the encoder, generator, and joint discriminator of the image generation module. In this way, we can learn causal representations in the image's latent space and use causal intervention operations to control image generation. We conduct extensive experiments on a real-world dataset, CelebA. The experimental results illustrate the effectiveness of CCIG.
ISSN:	2095-9184 2095-9230
DOI:	10.1631/FITEE.2300303
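
Illustrative note: the Summary mentions using causal intervention operations on latent representations to control generation. The sketch below shows the general idea of a do-operation on a small linear structural causal model. It is not the CCIG implementation: the linear form, the attribute names, the edge weights, and the `propagate` helper are all assumptions chosen for illustration, and the generator network that would turn latent codes into an image is omitted.

```python
# Illustrative sketch only: a linear structural causal model (SCM) over
# latent attribute codes. The attribute names, edge weights, and the linear
# form are assumptions, not taken from the CCIG paper.

def propagate(weights, order, noise, interventions=None):
    """Compute latent values in topological order.

    weights[child][parent] is the assumed causal strength parent -> child.
    interventions maps a variable to a fixed value (a do-operation): that
    variable ignores its parents and noise, but its children still see it.
    """
    interventions = interventions or {}
    values = {}
    for name in order:
        if name in interventions:
            values[name] = interventions[name]
        else:
            parents = weights.get(name, {})
            values[name] = noise[name] + sum(
                w * values[p] for p, w in parents.items()
            )
    return values


# Hypothetical causal graph: smiling -> mouth_open, smiling -> narrow_eyes.
ORDER = ["smiling", "mouth_open", "narrow_eyes"]
WEIGHTS = {
    "mouth_open": {"smiling": 0.8},
    "narrow_eyes": {"smiling": 0.5},
}
NOISE = {"smiling": 0.0, "mouth_open": 0.1, "narrow_eyes": -0.2}

# Values with no intervention applied.
factual = propagate(WEIGHTS, ORDER, NOISE)
# Intervening on a cause changes its effects.
do_smile = propagate(WEIGHTS, ORDER, NOISE, {"smiling": 1.0})
# Intervening on an effect leaves its cause untouched.
do_mouth = propagate(WEIGHTS, ORDER, NOISE, {"mouth_open": 1.0})
```

In this toy setting, setting `smiling` shifts both downstream attributes, whereas setting `mouth_open` directly does not alter `smiling`. That asymmetry is what distinguishes a causal intervention from editing correlated latent dimensions independently.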