CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Architecture Overview

This is a Hugging Face Spaces application that edits images with the Qwen-Image-Edit model. The application consists of a single Gradio script (app.py) that orchestrates:

  1. Image Processing Pipeline: Uses QwenImageEditPipeline from diffusers for the core image editing
  2. Prompt Enhancement: Calls the DashScope API (qwen-vl-max-latest) to rewrite and enhance user edit instructions automatically, guided by a detailed system prompt
  3. Gradio Interface: Web UI with image upload, text input, and advanced parameter controls

Key Components

  • Main Pipeline: QwenImageEditPipeline loaded from the "Qwen/Qwen-Image-Edit" model
  • Prompt Polishing: polish_prompt() function that uses the DashScope API to enhance user instructions via a comprehensive system prompt template
  • Inference Function: infer() decorated with @spaces.GPU(duration=120) for GPU acceleration
  • UI Layout: Single-page Gradio interface with examples and advanced settings accordion
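
The components above fit together roughly as follows. This is a minimal sketch, not the code in app.py: run_edit, its parameter names, and the keyword arguments passed to the pipeline are assumptions for illustration. The pipeline and the prompt polisher are injected as plain callables so the flow can be exercised without a GPU or an API key. Falling back to the original prompt when polishing fails is likewise an assumption.

```python
from typing import Any, Callable, Optional, Tuple


def run_edit(
    image: Any,
    prompt: str,
    pipeline: Callable[..., Any],
    polish: Optional[Callable[[str, Any], str]] = None,
    rewrite_prompt: bool = True,
    num_inference_steps: int = 50,
    guidance_scale: float = 4.0,
) -> Tuple[Any, str]:
    """Optionally polish the prompt, then hand everything to the pipeline.

    Returns the pipeline result together with the prompt that was actually used.
    """
    final_prompt = prompt
    if rewrite_prompt and polish is not None:
        try:
            final_prompt = polish(prompt, image)
        except Exception:
            # Assumption: a failed rewrite should not block the edit itself.
            final_prompt = prompt

    result = pipeline(
        image=image,
        prompt=final_prompt,
        negative_prompt=" ",  # single space, per Configuration Notes
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,  # hypothetical keyword name
    )
    return result, final_prompt


if __name__ == "__main__":
    # Stand-ins for QwenImageEditPipeline and polish_prompt().
    def fake_pipeline(**kwargs: Any) -> dict:
        return kwargs

    def fake_polish(text: str, _image: Any) -> str:
        return text.strip().capitalize()

    out, used = run_edit("img", "  make the sky pink ", fake_pipeline, fake_polish)
    print(used)
```

Keeping the orchestration free of direct model and API imports makes it easy to test the rewrite toggle and the fallback path in isolation; the real infer() would pass the loaded pipeline and polish_prompt() in their place.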

Environment Requirements

  • GPU: CUDA-enabled environment preferred (falls back to CPU)
  • API Key: DASH_API_KEY environment variable required for prompt enhancement via DashScope
  • Dependencies: PyTorch, diffusers (git version), transformers, dashscope, gradio
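
A startup check for these requirements might look like the sketch below. The helper names (cuda_available, pick_device, read_dash_api_key) are hypothetical and do not come from app.py; only torch.cuda.is_available() and the DASH_API_KEY variable name are taken as given. Treating a missing key as "prompt enhancement disabled" is an assumption about the desired behavior.

```python
import os
from typing import Mapping, Optional


def cuda_available() -> bool:
    """Report whether PyTorch can see a CUDA device; False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def pick_device(has_cuda: bool) -> str:
    """Prefer the GPU, otherwise fall back to CPU."""
    return "cuda" if has_cuda else "cpu"


def read_dash_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return DASH_API_KEY with whitespace trimmed, or None when unset or empty."""
    key = env.get("DASH_API_KEY", "").strip()
    return key or None


if __name__ == "__main__":
    device = pick_device(cuda_available())
    key = read_dash_api_key()
    print(f"device={device}")
    print("prompt enhancement:", "enabled" if key else "disabled (DASH_API_KEY not set)")
```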

Development Commands

Install dependencies:

pip install -r requirements.txt

Run the application:

python app.py

Configuration Notes

  • Default inference settings: 50 steps, guidance scale 4.0, bfloat16 dtype
  • Hardcoded negative prompt: single space character
  • GPU duration limit: 120 seconds per infer() call in the Spaces environment
  • Examples include three preset image/prompt combinations for testing
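
The defaults listed above can be gathered in one place, as in this sketch. The class and field names are hypothetical; only the values come from the notes. The helper method gives a rough per-step time budget under the simplifying assumption that the entire GPU window is spent on denoising steps, which ignores model loading and image encoding and decoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceDefaults:
    """Values taken from the Configuration Notes; the field names are hypothetical."""

    num_inference_steps: int = 50
    guidance_scale: float = 4.0
    dtype_name: str = "bfloat16"
    negative_prompt: str = " "  # a single space, not an empty string
    gpu_duration_seconds: int = 120

    def seconds_per_step_budget(self) -> float:
        """Upper bound on average step time if the whole GPU window went to denoising."""
        return self.gpu_duration_seconds / self.num_inference_steps


if __name__ == "__main__":
    defaults = InferenceDefaults()
    print(defaults.seconds_per_step_budget())  # 120 / 50 = 2.4
```

With the stated defaults, each step has to average under about 2.4 seconds to finish inside the 120-second window, so raising the step count in the advanced settings shrinks that margin proportionally.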