Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss Paper • 2512.23447 • Published 3 days ago • 83
Baichuan-M2: Scaling Medical Capability with Large Verifier System Paper • 2509.02208 • Published Sep 2, 2025 • 42
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion Paper • 2503.16212 • Published Mar 20, 2025 • 25
Reinforcing Multimodal Understanding and Generation with Dual Self-rewards Paper • 2506.07963 • Published Jun 9, 2025 • 1
The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason Paper • 2505.22653 • Published May 28, 2025 • 43