MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published 21 days ago • 108
VideoScore2: Think before You Score in Generative Video Evaluation Paper • 2509.22799 • Published Sep 26 • 24
Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning Paper • 2509.22824 • Published Sep 26 • 20