Generative Universal Verifier as Multimodal Meta-Reasoner • Paper • 2510.13804 • Published 11 days ago • 24
SpaceVista: All-Scale Visual Spatial Reasoning from mm to km • Paper • 2510.09606 • Published 16 days ago • 17
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels • Paper • 2510.06499 • Published 19 days ago • 31
StreamingVLM: Real-Time Understanding for Infinite Video Streams • Paper • 2510.09608 • Published 16 days ago • 48
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation • Paper • 2510.08673 • Published 17 days ago • 117
UniVideo: Unified Understanding, Generation, and Editing for Videos • Paper • 2510.08377 • Published 17 days ago • 66
SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models • Paper • 2510.06917 • Published 18 days ago • 34
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding • Paper • 2510.06308 • Published 19 days ago • 51
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation • Paper • 2510.05094 • Published 20 days ago • 35
Paper2Video: Automatic Video Generation from Scientific Papers • Paper • 2510.05096 • Published 20 days ago • 106
Video-LMM Post-Training: A Deep Dive into Video Reasoning with Large Multimodal Models • Paper • 2510.05034 • Published 20 days ago • 45
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search • Paper • 2509.25454 • Published 27 days ago • 133
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation • Paper • 2510.02283 • Published 24 days ago • 91