Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models • Paper • 2510.11683 • Published Oct 13 • 13
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? • Paper • 2510.02209 • Published Oct 2 • 52
SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression • Paper • 2509.25176 • Published Sep 29 • 13
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning • Paper • 2506.18841 • Published Jun 23 • 56
OpenSAE-LLaMA-3.1-8B • Collection • OpenSAE checkpoints for LLaMA 3.1 8B base model • 38 items • Updated Jan 29 • 5
MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos • Paper • 2506.04141 • Published Jun 4 • 29
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis • Paper • 2506.04142 • Published Jun 4 • 27
From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents • Paper • 2409.03512 • Published Sep 5, 2024 • 29