---
title: DeepSeek OCR PDF
emoji: 🏃
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: OCR interface for your PDF files
---

# DeepSeek-OCR PDF & Image Interface

This Space wraps [`deepseek-ai/DeepSeek-OCR`](https://huggingface.co/deepseek-ai/DeepSeek-OCR) with a polished Gradio UI that can transcribe both individual images and multi-page PDFs into clean Markdown. It targets the free T4 GPU tier for fast startup while enabling flash-attention and optional vLLM acceleration for multi-page batching.

## Features

- Support for `.png`, `.jpg`, `.jpeg`, `.webp`, `.tiff`, and `.pdf`
- Automatic PDF page conversion with PyMuPDF at 192 DPI
- Gundam mode defaults (`base_size=1024`, `image_size=640`, `crop_mode=True`) for balanced speed and accuracy
- Markdown-formatted output with per-page sections
- Optional custom prompt to tailor extraction instructions

## Running Locally

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
```

The interface launches on `http://127.0.0.1:7860` by default. Set the environment variable `USE_VLLM=0` to disable the vLLM backend or leave it enabled to leverage faster batching when the dependency is available.

## Space Configuration

- **Hardware**: `t4-small`
- **Python**: `3.10`
- **SDK**: `Gradio 5.49.1`
- **Model**: `deepseek-ai/DeepSeek-OCR`

Refer to the [Spaces configuration reference](https://huggingface.co/docs/hub/spaces-config-reference) for additional customization options.
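
## Example: Backend Toggle, Render Size, and Per-Page Output

Three behaviours mentioned in this README (the `USE_VLLM` toggle, rendering PDF pages at 192 DPI, and Markdown output with per-page sections) can be sketched with the standard library alone. This is a minimal illustration, not the code in `app.py`: the function names `vllm_enabled`, `rendered_size`, and `pages_to_markdown` are hypothetical, the exact parsing of `USE_VLLM` is an assumption (only the value `0` is documented as disabling vLLM), and the `## Page N` heading format is assumed rather than taken from the app.

```python
import os

PDF_POINTS_PER_INCH = 72  # PDF user space: 1 point = 1/72 inch
RENDER_DPI = 192          # DPI stated in the Features list


def vllm_enabled(environ=None):
    """Return False only when USE_VLLM is explicitly "0" (assumed semantics)."""
    environ = os.environ if environ is None else environ
    return environ.get("USE_VLLM", "1").strip() != "0"


def rendered_size(width_pt, height_pt, dpi=RENDER_DPI):
    """Pixel size of a page rasterized at `dpi`, given its size in PDF points."""
    zoom = dpi / PDF_POINTS_PER_INCH  # 192 / 72 is roughly 2.67x
    return round(width_pt * zoom), round(height_pt * zoom)


def pages_to_markdown(page_texts):
    """Join per-page OCR text into one Markdown string, one heading per page.

    The "## Page N" heading is a hypothetical format chosen for illustration.
    """
    sections = [
        f"## Page {number}\n\n{text.strip()}"
        for number, text in enumerate(page_texts, start=1)
    ]
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(vllm_enabled({"USE_VLLM": "0"}))   # explicit opt-out
    print(rendered_size(612, 792))           # US Letter page in points
    print(pages_to_markdown(["First page text", "Second page text"]))
```

With these assumptions, a US Letter page (612 × 792 points) rasterizes to 1632 × 2112 pixels at 192 DPI, which gives a rough sense of the image size handed to the model for each PDF page.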