Self-Rewarding Vision-Language Model via Reasoning Decomposition Paper • 2508.19652 • Published Aug 27 • 84
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 167
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model Paper • 2504.07615 • Published Apr 10 • 33
Collaborative Instance Navigation: Leveraging Agent Self-Dialogue to Minimize User Input Paper • 2412.01250 • Published Dec 2, 2024 • 5