A$^2$FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning Paper • 2510.12838 • Published 11 days ago • 22
Scaling Language-Centric Omnimodal Representation Learning Paper • 2510.11693 • Published 11 days ago • 94
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs Paper • 2510.10689 • Published 12 days ago • 45
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding Paper • 2510.11498 • Published 11 days ago • 10
ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems Paper • 2510.11652 • Published 11 days ago • 26
ARES: Multimodal Adaptive Reasoning via Difficulty-Aware Token-Level Entropy Shaping Paper • 2510.08457 • Published 15 days ago • 12
MF-MOS: A Motion-Focused Model for Moving Object Segmentation Paper • 2401.17023 • Published Jan 30, 2024 • 1