---
title: DeepSeek OCR PDF
emoji: 🏃
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: OCR interface for your PDF files
---

# DeepSeek-OCR PDF & Image Interface

This Space wraps [`deepseek-ai/DeepSeek-OCR`](https://huggingface.co/deepseek-ai/DeepSeek-OCR) in a Gradio UI that transcribes individual images and multi-page PDFs into Markdown. It targets the free T4 GPU tier for fast startup, enables flash-attention, and can optionally use vLLM to batch the pages of multi-page documents.

## Features

- Support for `.png`, `.jpg`, `.jpeg`, `.webp`, `.tiff`, and `.pdf` inputs
- Automatic rendering of PDF pages to images with PyMuPDF at 192 DPI
- Gundam mode defaults (`base_size=1024`, `image_size=640`, `crop_mode=True`) for balanced speed and accuracy
- Markdown-formatted output with per-page sections
- Optional custom prompt to tailor extraction instructions
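
Two of the features above reduce to simple arithmetic and string assembly. The sketch below is a minimal illustration, not the code in `app.py`: `render_size` and `pages_to_markdown` are hypothetical helper names, and the `## Page N` heading and `---` separator are assumed formats. It relies only on the fact that PDF page dimensions are measured in points (72 per inch), so rendering at 192 DPI scales each dimension by 192/72.

```python
POINTS_PER_INCH = 72  # PDF page dimensions are expressed in points


def render_size(width_pt: float, height_pt: float, dpi: int = 192) -> tuple[int, int]:
    """Return the pixel dimensions of a page rendered at the given DPI."""
    return (
        round(width_pt * dpi / POINTS_PER_INCH),
        round(height_pt * dpi / POINTS_PER_INCH),
    )


def pages_to_markdown(page_texts: list[str]) -> str:
    """Join per-page OCR text into a single Markdown document.

    The heading and separator formats here are assumptions for illustration.
    """
    sections = [
        f"## Page {number}\n\n{text.strip()}"
        for number, text in enumerate(page_texts, start=1)
    ]
    return "\n\n---\n\n".join(sections)


if __name__ == "__main__":
    # A US Letter page measures 612 x 792 points.
    print(render_size(612, 792))
    print(pages_to_markdown(["First page text", "Second page text"]))
```

A US Letter page therefore renders to roughly 1632 × 2112 pixels at 192 DPI. Higher DPI values give the model more detail per page at the cost of larger images and slower inference.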

## Running Locally

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
```

The interface launches at `http://127.0.0.1:7860` by default. The vLLM backend is enabled by default and is used for faster batching when the dependency is available. To disable it, set the environment variable `USE_VLLM=0`, for example `USE_VLLM=0 python app.py`.
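
The backend toggle described above could be implemented along the following lines. This is a sketch under stated assumptions rather than the logic in `app.py`: the function names, the `"transformers"` fallback label, and the extra disabling values beyond `0` are all hypothetical. Only the `USE_VLLM` variable and its `0` value come from this README.

```python
import importlib.util
import os
from typing import Mapping, Optional

# Assumption: only "0" is documented; the other values are a convenience.
_DISABLED_VALUES = {"0", "false", "no", "off"}


def vllm_requested(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True unless USE_VLLM is set to a disabling value."""
    env = os.environ if env is None else env
    return env.get("USE_VLLM", "1").strip().lower() not in _DISABLED_VALUES


def vllm_available() -> bool:
    """Check whether the vllm package can be imported, without importing it."""
    return importlib.util.find_spec("vllm") is not None


def select_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick "vllm" only when it is both requested and installed."""
    if vllm_requested(env) and vllm_available():
        return "vllm"
    return "transformers"  # hypothetical name for the non-vLLM path


if __name__ == "__main__":
    print(select_backend())
```

Checking availability with `importlib.util.find_spec` avoids paying the import cost of vLLM just to decide which path to take, and it lets the app fall back quietly when the package is missing.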

## Space Configuration

- **Hardware**: `t4-small`
- **Python**: `3.10`
- **SDK**: `Gradio 5.49.1`
- **Model**: `deepseek-ai/DeepSeek-OCR`

Refer to the [Spaces configuration reference](https://huggingface.co/docs/hub/spaces-config-reference) for additional customization options.