---
license: apache-2.0
---

## X-Omni-En (supports English text rendering)

🏠 Project Page | 📄 Paper | 💻 Code | 🚀 HuggingFace Space

## 🌟 Highlights

- **Unified Modeling Approach**: A discrete autoregressive model that handles both image and language modalities.
- **Superior Instruction Following**: Exceptional capability to follow complex instructions.
- **Superior Text Rendering**: Accurately renders text in English.
- **Arbitrary Resolutions**: Produces aesthetically pleasing images at arbitrary resolutions.

## 📖 Citation

If you find this project helpful for your research or use it in your own work, please cite our paper:

```bibtex
@article{geng2025xomni,
      author  = {Zigang Geng and Yibing Wang and Yeyao Ma and Chen Li and Yongming Rao and Shuyang Gu and Zhao Zhong and Qinglin Lu and Han Hu and Xiaosong Zhang and Linus and Di Wang and Jie Jiang},
      title   = {X-Omni: Reinforcement Learning Makes Discrete Autoregressive Image Generative Models Great Again},
      journal = {CoRR},
      volume  = {abs/2507.22058},
      year    = {2025},
}
```