Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window Paper • 2510.08276 • Published 23 days ago • 9
RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback Paper • 2507.15024 • Published Jul 20 • 14
RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback Paper • 2507.15024 • Published Jul 20 • 14
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 420
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 185
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 185
Towards Better Dynamic Graph Learning: New Architecture and Unified Library Paper • 2303.13047 • Published Mar 23, 2023
Heterogeneous Graph Representation Learning with Relation Awareness Paper • 2105.11122 • Published May 24, 2021
A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models Paper • 2410.13841 • Published Oct 17, 2024 • 17