llm - a bypan123 Collection

bypan123 's Collections

llm

llm

updated Sep 17

Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation

Paper • 2503.22675 • Published Mar 28 • 36
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback

Paper • 2503.22230 • Published Mar 28 • 45
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization

Paper • 2509.13313 • Published Sep 16 • 78
WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents

Paper • 2509.13309 • Published Sep 16 • 66
Towards General Agentic Intelligence via Environment Scaling

Paper • 2509.13311 • Published Sep 16 • 70
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning

Paper • 2509.13305 • Published Sep 16 • 88
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 112
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

Paper • 2509.13312 • Published Sep 16 • 104
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

Paper • 2505.19253 • Published May 25 • 31