Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation Paper • 2503.22675 • Published Mar 28 • 36
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback Paper • 2503.22230 • Published Mar 28 • 45
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization Paper • 2509.13313 • Published Sep 16 • 78
WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents Paper • 2509.13309 • Published Sep 16 • 66
Towards General Agentic Intelligence via Environment Scaling Paper • 2509.13311 • Published Sep 16 • 70
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning Paper • 2509.13305 • Published Sep 16 • 88
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research Paper • 2509.13312 • Published Sep 16 • 104
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research Paper • 2505.19253 • Published May 25 • 31