- Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning
  Paper • 2506.07044 • Published • 113
- ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
  Paper • 2506.09513 • Published • 98
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 97
- Seedance 1.0: Exploring the Boundaries of Video Generation Models
  Paper • 2506.09113 • Published • 102
Collections
Collections including paper arxiv:2506.18945
- CoRAG: Collaborative Retrieval-Augmented Generation
  Paper • 2504.01883 • Published • 9
- VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
  Paper • 2504.08837 • Published • 43
- Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
  Paper • 2504.10068 • Published • 30
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
  Paper • 2504.10481 • Published • 84

- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 122
- Process-Supervised Reinforcement Learning for Code Generation
  Paper • 2502.01715 • Published

- microsoft/bitnet-b1.58-2B-4T
  Text Generation • 0.8B • Updated • 11.2k • 1.21k
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
  Paper • 2504.10449 • Published • 15
- nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
  Text Generation • 8B • Updated • 448 • 15
- ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
  Paper • 2504.11536 • Published • 62

- Hogwild! Inference: Parallel LLM Generation via Concurrent Attention
  Paper • 2504.06261 • Published • 110
- InteractVLM: 3D Interaction Reasoning from 2D Foundational Models
  Paper • 2504.05303 • Published • 5
- TTRL: Test-Time Reinforcement Learning
  Paper • 2504.16084 • Published • 120
- Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation
  Paper • 2505.13215 • Published • 29