THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning Paper • 2509.13761 • Published Sep 17 • 16
NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings Paper • 2509.04011 • Published Sep 4 • 28
DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks Paper • 2509.01396 • Published Sep 1 • 56
Robix: A Unified Model for Robot Interaction, Reasoning and Planning Paper • 2509.01106 • Published Sep 1 • 48
Provable Benefits of In-Tool Learning for Large Language Models Paper • 2508.20755 • Published Aug 28 • 11
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization Paper • 2508.14460 • Published Aug 20 • 82
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments Paper • 2508.08791 • Published Aug 12 • 16