Adaptive Preference Optimization with Uncertainty-aware Utility Anchor Paper • 2509.10515 • Published Sep 3, 2025
UltraVoice: Scaling Fine-Grained Style-Controlled Speech Conversations for Spoken Dialogue Models Paper • 2510.22588 • Published Oct 26, 2025 • 1
IMTalker: Efficient Audio-driven Talking Face Generation with Implicit Motion Transfer Paper • 2511.22167 • Published Nov 27, 2025
Make an Offer They Can't Refuse: Grounding Bayesian Persuasion in Real-World Dialogues without Pre-Commitment Paper • 2510.13387 • Published Oct 15, 2025
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia Paper • 2512.03318 • Published Dec 3, 2025
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published 26 days ago • 74 • 4
Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs Paper • 2508.19594 • Published Aug 27, 2025 • 2
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling Paper • 2506.08672 • Published Jun 10, 2025 • 30 • 3
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space Paper • 2505.13308 • Published May 19, 2025 • 27 • 4