This model is the Edit-R1-Qwen-Image-Edit-2509 checkpoint, trained with the Edit-R1 framework presented in the paper *Uniworld-V2: Reinforce Image Editing with Diffusion Negative-aware Finetuning and MLLM Implicit Feedback*. Edit-R1 reinforces instruction-based image editing through Diffusion Negative-aware Finetuning (DiffusionNFT) combined with implicit feedback from a multimodal large language model (MLLM).
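At a high level, the training loop steers the editing model toward outputs the MLLM scores highly and away from outputs it scores poorly. The snippet below is a minimal conceptual sketch of that idea as a reward-weighted denoising step; it is not the paper's exact DiffusionNFT objective or the released training code, and every name in it (`model`, `scheduler`, `rewards`, `reward_weighted_denoising_step`) is an illustrative assumption rather than part of this repository.

```python
# Conceptual sketch only -- NOT the released Edit-R1/DiffusionNFT training code.
# `model` is assumed to be a diffusers-style UNet and `scheduler` a diffusers
# noise scheduler; `rewards` would come from MLLM scoring of the edits.
import torch


def reward_weighted_denoising_step(model, scheduler, optimizer, images, rewards):
    """One toy finetuning step: weight the denoising loss by centered reward,
    so high-reward samples are reinforced and low-reward ones are suppressed."""
    # Center the rewards: above-average edits get positive weight ("positive"
    # samples), below-average edits get negative weight ("negative" samples).
    advantages = (rewards - rewards.mean()).detach()

    # Standard forward-diffusion noising of the training images.
    noise = torch.randn_like(images)
    timesteps = torch.randint(
        0, scheduler.config.num_train_timesteps, (images.shape[0],),
        device=images.device,
    )
    noisy = scheduler.add_noise(images, noise, timesteps)

    # Per-sample noise-prediction (denoising) loss.
    pred = model(noisy, timesteps).sample
    per_sample = ((pred - noise) ** 2).flatten(1).mean(dim=1)

    # Negative advantages flip the sign of the loss, pushing probability mass
    # away from low-reward edits. (The actual DiffusionNFT objective handles
    # this positive/negative contrast more carefully to keep training stable.)
    loss = (advantages * per_sample).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

On two standard image-editing benchmarks, the Edit-R1 LoRA improves over the base Qwen-Image-Edit-2509 model: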
| Benchmark | Qwen-Image-Edit-2509 | Edit-R1-Qwen-Image-Edit-2509 |
|---|---|---|
| GEdit-Bench | 7.54 | 7.76 |
| ImgEdit | 4.35 | 4.48 |
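To use the checkpoint, load the base Qwen-Image-Edit-2509 pipeline with diffusers, attach the Edit-R1 LoRA on top of it, and run a multi-image edit:

```python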
import os

import torch
from PIL import Image
from diffusers import QwenImageEditPlusPipeline

# Load the base Qwen-Image-Edit-2509 pipeline in bfloat16.
pipeline = QwenImageEditPlusPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2509", torch_dtype=torch.bfloat16
)
print("pipeline loaded")

# Attach the Edit-R1 LoRA and activate it at full strength.
pipeline.load_lora_weights(
    "chestnutlzj/Edit-R1-Qwen-Image-Edit-2509",
    adapter_name="lora",
)
pipeline.set_adapters(["lora"], adapter_weights=[1.0])
pipeline.to("cuda")
pipeline.set_progress_bar_config(disable=None)

# Two reference images are composed according to the instruction prompt.
image1 = Image.open("input1.png")
image2 = Image.open("input2.png")
prompt = "The magician bear is on the left, the alchemist bear is on the right, facing each other in the central park square."
inputs = {
    "image": [image1, image2],
    "prompt": prompt,
    "generator": torch.manual_seed(0),  # fixed seed for reproducibility
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 40,
    "guidance_scale": 1.0,
    "num_images_per_prompt": 1,
}

# Run inference without gradient tracking and save the edited image.
with torch.inference_mode():
    output = pipeline(**inputs)
output_image = output.images[0]
output_image.save("output_image_edit_plus.png")
print("image saved at", os.path.abspath("output_image_edit_plus.png"))