arxiv:2503.04625
ChengpengLi
AI & ML interests
LLM for Reasoning, reinforcement learning, recommendation system, diffusion models
Recent Activity
upvoted a paper 9 days ago
Agentic Entropy-Balanced Policy Optimization
upvoted a paper 27 days ago
Quantile Advantage Estimation for Entropy-Safe Reasoning
upvoted a paper 2 months ago
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
Organizations
None yet