view article Article Can Your LLM Think Like a Professional? Introducing ProfBench By nvidia and 7 others • 6 days ago • 14
view article Article Can Your LLM Think Like a Professional? Introducing ProfBench By nvidia and 7 others • 6 days ago • 14
BroRL: Scaling Reinforcement Learning via Broadened Exploration Paper • 2510.01180 • Published Oct 1 • 17
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 136
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models Paper • 2505.24864 • Published May 30 • 138
StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements Paper • 2408.15666 • Published Aug 28, 2024 • 11
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models Paper • 2406.18510 • Published Jun 26, 2024 • 9
Localized Symbolic Knowledge Distillation for Visual Commonsense Models Paper • 2312.04837 • Published Dec 8, 2023 • 3
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning Paper • 2312.01552 • Published Dec 4, 2023 • 32
Tailoring Self-Rationalizers with Multi-Reward Distillation Paper • 2311.02805 • Published Nov 6, 2023 • 7
The Generative AI Paradox: "What It Can Create, It May Not Understand" Paper • 2311.00059 • Published Oct 31, 2023 • 20