---
license: apache-2.0
datasets:
- OmniSVG/MMSVG-Icon
- OmniSVG/MMSVG-Illustration
- OmniSVG/MMSVGBench
language:
- en
- zh
base_model:
- Qwen/Qwen2.5-VL-7B-Instruct
pipeline_tag: text-generation
tags:
- SVG
- Image-to-SVG
- Text-to-SVG
---
# OmniSVG: A Unified Scalable Vector Graphics Generation Model
## 1. Introduction
**OmniSVG** is the first family of end-to-end multimodal SVG generators that leverage pre-trained Vision-Language Models (VLMs), capable of generating complex and detailed SVGs, from simple icons to intricate anime characters. We also introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks.
## 2. Model Download
| Model | Download link | Size | Update date |
|-----------------------------|-------------------------------|------------|------|
| OmniSVG1.1-8B| 🤗 [Huggingface](https://huggingface.co/OmniSVG/OmniSVG1.1_8B) | 17.2 GB | 2025-12-02 |
## 3. Dependencies and Installation
Following the instructions below will set up an environment ready for inference.
### 3.1 Clone the Repository
```bash
git clone https://github.com/OmniSVG/OmniSVG.git
cd OmniSVG
```
### 3.2 Create Conda Environment
Create and activate a new conda environment with Python 3.10:
```bash
conda create -n omnisvg python=3.10
conda activate omnisvg
```
### 3.3 Install Dependencies
#### System Dependencies
Before installing Python packages, install the Cairo library, which is required by `CairoSVG` in our dependencies:
**macOS:**
```bash
brew install cairo
```
**Linux (Ubuntu/Debian):**
```bash
sudo apt update
sudo apt install libcairo2 libcairo2-dev
```
> **Note:** Installing the Cairo system library beforehand helps prevent build errors when installing `CairoSVG` via pip.
#### Python Dependencies
We have tested our environment with CUDA 12.1. You can install CUDA 12.1 by following the [CUDA Toolkit installation guide](https://developer.nvidia.com/cuda-12-1-0-download-archive).
Install PyTorch with CUDA 12.1 support:
```bash
pip install torch==2.3.0+cu121 torchvision==0.18.0+cu121 --index-url https://download.pytorch.org/whl/cu121
```
Install remaining dependencies:
```bash
pip install -r requirements.txt
```
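Once everything is installed, a quick sanity check can confirm that the key packages import cleanly (a minimal sketch, assuming the `omnisvg` environment above is active; it reports rather than fails when a package is missing):

```shell
# Report import status for the core dependencies
for pkg in torch cairosvg; do
    if python -c "import $pkg" 2>/dev/null; then
        echo "$pkg: OK"
    else
        echo "$pkg: missing"
    fi
done
```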
## 4. Inference Script
| Model | GPU Memory Usage | Generation time for 256/512/1024/2048/4096 tokens |
| ------------- | ---------------- | ------------------------------------- |
| OmniSVG1.1_8B | 26 GB | 5.38 / 9.02 / 20.11 / 40.34 / 98.11 seconds |
> **Note:** The inference time shown here is measured in OmniSVG SVG tokens, while the inference time reported in our paper is measured in XML code tokens for a fair comparison with baseline methods.
### Quick Start
**Download Model Weights**
First, install the Hugging Face CLI tool:
```bash
pip install huggingface-hub
```
**Download the model from Hugging Face:**
```bash
# Download OmniSVG1.1-8B
huggingface-cli download OmniSVG/OmniSVG1.1_8B --local-dir /PATH/TO/OmniSVG1.1_8B
# Download OmniSVG1.1-4B
huggingface-cli download OmniSVG/OmniSVG1.1_4B --local-dir /PATH/TO/OmniSVG1.1_4B
# Download OmniSVG-3B (legacy)
huggingface-cli download OmniSVG/OmniSVG --local-dir /PATH/TO/OmniSVG-3B
```
### Text-to-SVG Generation
**Basic usage - generate SVGs from a text file:**
```bash
python inference.py --task text-to-svg --input prompts.txt --output ./output_text --save-all-candidates
```
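The `--input` argument points at a plain-text file of prompts. A minimal sketch of creating one, assuming a format of one prompt per line (the exact format is not documented here, so treat this as an assumption):

```shell
# Create a sample prompts.txt (assumed format: one text prompt per line)
cat > prompts.txt <<'EOF'
A red heart icon with rounded edges
A minimalist black bicycle icon
EOF
```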
**Use 4B model:**
```bash
python inference.py --task text-to-svg --input prompts.txt --output ./output_text --model-size 4B --save-all-candidates
```
**Generate more candidates and save PNG:**
```bash
python inference.py --task text-to-svg --input prompts.txt --output ./output_text \
--num-candidates 8 --save-png --save-all-candidates
```
**Custom generation parameters:**
```bash
python inference.py --task text-to-svg --input prompts.txt --output ./output_text \
--temperature 0.5 --top-p 0.9 --top-k 50 --repetition-penalty 1.05
```
**Use local model:**
```bash
python inference.py --task text-to-svg --input prompts.txt --output ./output_text \
--model-path /path/to/qwen --weight-path /path/to/omnisvg
```
### Image-to-SVG Generation
```bash
python inference.py --task image-to-svg --input ./examples --output ./output_image --save-all-candidates
```
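For image-to-SVG, `--input` points at a directory of reference images. A minimal sketch of preparing one (the raster input format, e.g. PNG, is an assumption; the image files themselves are user-supplied):

```shell
# Create the input directory for image-to-svg generation
mkdir -p examples
# Copy your reference images in, e.g.:
# cp your_icon.png examples/
```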
### Interactive Demo
We provide an interactive generation interface using Gradio:
- **Local Deployment**
```bash
python app.py
```
- **Online Demo**
Try our live demo on [Hugging Face Spaces](https://huggingface.co/spaces/OmniSVG/OmniSVG)
## 5. License
OmniSVG is licensed under the [**Apache License 2.0**](https://www.apache.org/licenses/LICENSE-2.0), while the MMSVG dataset is under the [**Creative Commons Attribution Non Commercial Share Alike 4.0 License**](https://spdx.org/licenses/CC-BY-NC-SA-4.0). You can find the license files in the respective GitHub and Hugging Face repositories.
## Citation
```bibtex
@article{yang2025omnisvg,
title={OmniSVG: A Unified Scalable Vector Graphics Generation Model},
author={Yiying Yang and Wei Cheng and Sijin Chen and Xianfang Zeng and Jiaxu Zhang and Liao Wang and Gang Yu and Xinjun Ma and Yu-Gang Jiang},
journal={arXiv preprint arXiv:2504.06263},
year={2025}
}
```
## Acknowledgments
We thank the following excellent open-source work:
- **IconShop**: the first work to leverage LLMs for generating monochrome, icon-level SVGs. We referred to its parametric implementation.

Highly related concurrent works:
- **LLM4SVG**: treats SVG coordinates as number strings and predicts the decimal part for higher spatial accuracy.
- **StarVector**: equips an LLM with an image encoder for image-to-SVG generation.