- Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
  Paper • 2502.14768 • Published • 47
- S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
  Paper • 2502.12853 • Published • 29
- Diverse Inference and Verification for Advanced Reasoning
  Paper • 2502.09955 • Published • 18
- Distillation Scaling Laws
  Paper • 2502.08606 • Published • 48
shanshan wang
cooleel
Recent Activity
- Updated a collection 13 days ago: DocAI
- Liked a Space about 2 months ago: AIDC-AI/Ovis2-4B
- Published a model 3 months ago: tensorlake/MonkeyOCR-pro-1.2B-recognition