	---
	license: apache-2.0
	---
	## X-Omni-En (supports English text rendering)

	<p align="left">
	<a href="https://x-omni-team.github.io">🏠 Project Page</a> |
	<a href="https://arxiv.org/pdf/2507.22058">📄 Paper</a> |
	<a href="https://github.com/X-Omni-Team/X-Omni">💻 Code</a> |
	<a href="https://huggingface.co/collections/X-Omni/x-omni-spaces-6888c64f38446f1efc402de7">🚀 HuggingFace Space</a>
	</p>

	## 🌟 Highlights

	- Unified Modeling Approach: A discrete autoregressive model that handles both image and language modalities.
	- Superior Instruction Following: Exceptional capability to follow complex instructions.
	- Superior Text Rendering: Accurately renders text in English.
	- Arbitrary Resolutions: Produces aesthetically pleasing images at arbitrary resolutions.

	<p align="left">
	<img src="assets/fig2-1.png" alt="" width="600" />
	<img src="assets/fig5-1.png" alt="" width="600" />
	</p>

	## 📖 Citation

	If you find this project helpful for your research or use it in your own work, please cite our paper:
	```bibtex
	@article{geng2025xomni,
	author = {Zigang Geng and Yibing Wang and Yeyao Ma and Chen Li and Yongming Rao and Shuyang Gu and Zhao Zhong and Qinglin Lu and Han Hu and Xiaosong Zhang and Linus and Di Wang and Jie Jiang},
	title = {X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again},
	journal = {CoRR},
	volume = {abs/2507.22058},
	year = {2025},
	}
	```