Pref-GRPO & UniGenBench
This model is tailored for offline evaluation of text-to-image (T2I) models on UniGenBench, where its judgments reach an average accuracy of 94% relative to evaluations by Gemini 2.5 Pro.
Feel free to use this model to assess and compare the performance of your models.
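Below is a minimal sketch of how a single evaluation query might look. It assumes the evaluator exposes a standard Qwen2.5-VL-style chat interface through `transformers` (the base model is Qwen-based); the repo id, prompt wording, and expected output format shown here are illustrative assumptions rather than the official evaluation protocol, so please follow the linked resources for the exact usage.

```python
# Sketch only: assumes a Qwen2.5-VL-style vision-language interface in transformers.
# The checkpoint id and prompt format below are placeholders, not the official protocol.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

MODEL_ID = "CodeGoat24/UnifiedReward-2.0-qwen-72b"  # base model; swap in the evaluator checkpoint

processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the 72B model across available GPUs via accelerate
)

# One generated image and the prompt it was supposed to depict (hypothetical example).
image = Image.open("generated_sample.png")
question = (
    "Does this image faithfully depict the prompt: 'a red cube on top of a blue sphere'? "
    "Answer Yes or No, then briefly explain."
)

messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": question},
    ],
}]

# Build the chat-formatted text, then tokenize text and image together.
text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

# Generate the judgment and decode only the newly generated tokens.
output_ids = model.generate(**inputs, max_new_tokens=64)
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```

Note that a 72B evaluator generally does not fit on a single GPU; `device_map="auto"` distributes the weights across all visible devices, and quantized loading can reduce the memory footprint further if needed.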
For further details, please refer to the following papers:
@article{UniGenBench++,
  title={UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation},
  author={Wang, Yibin and Li, Zhimin and Zang, Yuhang and Bu, Jiazi and Zhou, Yujie and Xin, Yi and He, Junjun and Wang, Chunyu and Lu, Qinglin and Jin, Cheng and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2510.18701},
  year={2025}
}
@article{UniGenBench,
  title={Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning},
  author={Wang, Yibin and Li, Zhimin and Zang, Yuhang and Zhou, Yujie and Bu, Jiazi and Wang, Chunyu and Lu, Qinglin and Jin, Cheng and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2508.20751},
  year={2025}
}
Base model: CodeGoat24/UnifiedReward-2.0-qwen-72b