Surfer 2: The Next Generation of Cross-Platform Computer Use Agents • Paper • 2510.19949 • Published 12 days ago • 34
Emerging Properties in Unified Multimodal Pretraining • Paper • 2505.14683 • Published May 20 • 134
SmolVLM: Redefining small and efficient multimodal models • Paper • 2504.05299 • Published Apr 7 • 200
Qwen2.5-VL • Collection • Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 544
Arbitrary-steps Image Super-resolution via Diffusion Inversion • Paper • 2412.09013 • Published Dec 12, 2024 • 13