Do LLMs Feel? Teaching Emotion Recognition with Prompts, Retrieval, and Curriculum Learning Paper • 2511.07061 • Published 20 days ago • 3
SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization Paper • 2511.06411 • Published 21 days ago • 16
Adaptive Multi-Agent Response Refinement in Conversational Systems Paper • 2511.08319 • Published 19 days ago • 40
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B Paper • 2511.06221 • Published 21 days ago • 123
cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning Paper • 2505.22914 • Published May 28 • 36
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published Jun 5 • 132
Article ⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2 Jun 3, 2024 • 27
The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design Paper • 2408.12503 • Published Aug 22, 2024 • 27