Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning Paper • 2506.10521 • Published Jun 12
RULER-Bench: Probing Rule-based Reasoning Abilities of Next-level Video Generation Models for Vision Foundation Intelligence Paper • 2512.02622 • Published 8 days ago
RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts Paper • 2508.12291 • Published Aug 17