BrushEdit: All-In-One Image Inpainting and Editing
Abstract
BrushEdit addresses image editing limitations by combining multimodal large language models and dual-branch inpainting models for autonomous and interactive free-form instruction editing.
Image editing has advanced significantly with the development of diffusion models using both inversion-based and instruction-based methods. However, current inversion-based approaches struggle with large modifications (e.g., adding or removing objects) because the structured nature of inversion noise hinders substantial changes. Meanwhile, instruction-based methods often constrain users to black-box operations, limiting direct interaction for specifying editing regions and intensity. To address these limitations, we propose BrushEdit, a novel inpainting-based instruction-guided image editing paradigm, which leverages multimodal large language models (MLLMs) and image inpainting models to enable autonomous, user-friendly, and interactive free-form instruction editing. Specifically, we devise a system enabling free-form instruction editing by integrating MLLMs and a dual-branch image inpainting model in an agent-cooperative framework to perform editing category classification, main object identification, mask acquisition, and editing area inpainting. Extensive experiments show that our framework effectively combines MLLMs and inpainting models, achieving superior performance across seven metrics including mask region preservation and editing effect coherence.
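The four stages named in the abstract (editing category classification, main object identification, mask acquisition, and editing area inpainting) can be read as a simple staged pipeline. The following is a minimal sketch of that control flow only, not BrushEdit's actual code or API: every name here (EditPlan, plan_edit, apply_edit, and the toy_* stand-ins for the MLLM, the segmenter, and the inpainting model) is hypothetical. The final compositing step, which copies pixels outside the mask back from the original image, is also an assumption made for illustration; the abstract reports mask region preservation as a metric but does not say how it is achieved.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical toy types: a grayscale image and a binary mask as nested lists.
Image = List[List[int]]
Mask = List[List[int]]  # 1 = pixel may be edited, 0 = pixel must be kept


@dataclass
class EditPlan:
    """Output of the three planning stages (all field names are illustrative)."""
    category: str  # e.g. "add", "remove", "edit"
    target: str    # main object the instruction refers to
    mask: Mask     # region that the inpainting stage is allowed to change


def plan_edit(
    instruction: str,
    image: Image,
    classify: Callable[[str], str],
    find_object: Callable[[str, str], str],
    segment: Callable[[Image, str], Mask],
) -> EditPlan:
    """Run stages 1-3: classify the edit, pick the main object, get a mask."""
    category = classify(instruction)
    target = find_object(instruction, category)
    mask = segment(image, target)
    return EditPlan(category, target, mask)


def apply_edit(
    image: Image,
    plan: EditPlan,
    inpaint: Callable[[Image, Mask, str], Image],
) -> Image:
    """Run stage 4, then keep original pixels wherever the mask is 0.

    The compositing rule is an assumption of this sketch.
    """
    proposal = inpaint(image, plan.mask, plan.target)
    return [
        [new if m else old for old, new, m in zip(old_row, new_row, mask_row)]
        for old_row, new_row, mask_row in zip(image, proposal, plan.mask)
    ]


# Toy stand-ins so the sketch runs without any model.
def toy_classify(instruction: str) -> str:
    first = instruction.split()[0].lower()
    return first if first in {"add", "remove"} else "edit"


def toy_find_object(instruction: str, category: str) -> str:
    return instruction.split()[-1]


def toy_segment(image: Image, target: str) -> Mask:
    # Pretend the target occupies the last column of every row.
    return [[0] * (len(row) - 1) + [1] for row in image]


def toy_inpaint(image: Image, mask: Mask, target: str) -> Image:
    # Pretend the model repaints every pixel with the value 9.
    return [[9 for _ in row] for row in image]


image = [[1, 2], [3, 4]]
plan = plan_edit("remove the dog", image, toy_classify, toy_find_object, toy_segment)
result = apply_edit(image, plan, toy_inpaint)
```

In an interactive setting, a user could replace the mask in the plan before calling apply_edit, which is one way to picture the difference between the fully automated and the interactive modes mentioned on this page.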
Community
BrushEdit is an advanced, unified AI agent for image inpainting and editing.
Key features: 🛠️ Fully automated / 🤠 Interactive editing.
🤗 Model Card: https://huggingface.co/TencentARC/BrushEdit
🤗 HF Demo: https://huggingface.co/spaces/TencentARC/BrushEdit
🎨 GitHub Repo: https://github.com/TencentARC/BrushEdit
🎨 Project Page: https://liyaowei-stu.github.io/project/BrushEdit/
🎨 arXiv: https://arxiv.org/abs/2412.10316
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- DreamMix: Decoupling Object Attributes for Enhanced Editability in Customized Image Inpainting (2024)
- Instruction-Guided Editing Controls for Images and Multimedia: A Survey in LLM era (2024)
- RAD: Region-Aware Diffusion Models for Image Inpainting (2024)
- PainterNet: Adaptive Image Inpainting with Actual-Token Attention and Diverse Mask Control (2024)
- TrojanEdit: Backdooring Text-Based Image Editing Models (2024)
- InsightEdit: Towards Better Instruction Following for Image Editing (2024)
- Stable Flow: Vital Layers for Training-Free Image Editing (2024)