How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks Paper • 2507.01955 • Published Jul 2, 2025 • 35
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 Jul 5, 2024 • 306
Article Reinforcement Learning for Large Language Models: Beyond the Agent Paradigm Mar 19, 2025 • 8