Qwen/Qwen3-VL-30B-A3B-Instruct-FP8 Image-Text-to-Text • 31B • Updated 20 days ago • 131k • 91
Qwen/Qwen3-VL-30B-A3B-Instruct Image-Text-to-Text • 31B • Updated 20 days ago • 1.35M • 443
Qwen/Qwen3-VL-30B-A3B-Thinking Image-Text-to-Text • 31B • Updated 20 days ago • 53.8k • 164
Qwen/Qwen3-VL-235B-A22B-Instruct-FP8 Image-Text-to-Text • 236B • Updated 20 days ago • 311k • 32
Qwen/Qwen3-VL-235B-A22B-Thinking-FP8 Image-Text-to-Text • 236B • Updated 20 days ago • 8.3k • 24
Qwen/Qwen3-VL-235B-A22B-Instruct Image-Text-to-Text • 236B • Updated 20 days ago • 146k • 334
Qwen/Qwen3-VL-235B-A22B-Thinking Image-Text-to-Text • 236B • Updated 20 days ago • 6.42k • 344
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 114
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 303
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs Paper • 2406.07476 • Published Jun 11, 2024 • 37
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization Paper • 2502.13922 • Published Feb 19 • 28