TTRV: Test-Time Reinforcement Learning for Vision Language Models • Paper • 2510.06783 • Published Oct 8
VisualOverload: Probing Visual Understanding of VLMs in Really Dense Scenes • Paper • 2509.25339 • Published Sep 29
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content • Paper • 2410.10783 • Published Oct 14, 2024