the most powerful vision-language model in the Qwen series to date. Available in Dense and MoE architectures
- Qwen/Qwen3-VL-30B-A3B-Thinking
  Image-Text-to-Text • 31B • 23.2k • 141
- mlx-community/Qwen3-VL-30B-A3B-Instruct-4bit
  Image-Text-to-Text • 2.19k • 5
- mlx-community/Qwen3-VL-30B-A3B-Instruct-8bit
  Image-Text-to-Text • 770 • 2
- mlx-community/Qwen3-VL-8B-Instruct-4bit
  Image-Text-to-Text • 255 • 2