DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing
Abstract
DragDiffusion extends interactive point-based image editing to diffusion models, offering precise spatial control and high-quality editing across diverse scenarios.
Precise and controllable image editing is a challenging task that has attracted significant attention. Recently, DragGAN enabled an interactive point-based image editing framework and achieved impressive editing results with pixel-level precision. However, since this method is based on generative adversarial networks (GANs), its generality is upper-bounded by the capacity of the pretrained GAN models. In this work, we extend such an editing framework to diffusion models and propose DragDiffusion. By leveraging large-scale pretrained diffusion models, we greatly improve the applicability of interactive point-based editing in real-world scenarios. While most existing diffusion-based image editing methods work on text embeddings, DragDiffusion optimizes the diffusion latent to achieve precise spatial control. Although diffusion models generate images in an iterative manner, we empirically show that optimizing the diffusion latent at a single step suffices to generate coherent results, enabling DragDiffusion to complete high-quality editing efficiently. Extensive experiments across a wide range of challenging cases (e.g., multiple objects, diverse object categories, and various styles) demonstrate the versatility and generality of DragDiffusion.
Community
Wow, I'm considering changing GAN to diffusion method these days, and the result will be there. Congratulations.
Congratulations!