MLX

NexaAI/qwen3vl-30B-A3B-mlx

🔧 Quickstart

Run directly with the nexa-sdk CLI:

nexa infer NexaAI/qwen3vl-30B-A3B-mlx

⚠️ Note: You need at least 64 GB of RAM on your Mac to run this model.
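Before downloading, it can help to confirm that the machine meets the 64 GB requirement. The following is a minimal sketch, not part of nexa-sdk: the helper names are hypothetical, and it assumes that macOS reports installed memory through `sysctl -n hw.memsize` and that a "64 GB" Mac reports 64 × 2^30 bytes.

```python
import subprocess
import sys

REQUIRED_GIB = 64  # from the note above; treated as 64 * 2**30 bytes (assumption)


def meets_requirement(total_bytes: int, required_gib: int = REQUIRED_GIB) -> bool:
    """Return True if total_bytes is at least required_gib GiB."""
    return total_bytes >= required_gib * 2**30


def installed_ram_bytes() -> int:
    """Read installed RAM on macOS via `sysctl -n hw.memsize` (macOS only)."""
    result = subprocess.run(
        ["sysctl", "-n", "hw.memsize"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    if sys.platform != "darwin":
        print("This check only applies to macOS.")
    elif meets_requirement(installed_ram_bytes()):
        print("RAM check passed: try `nexa infer NexaAI/qwen3vl-30B-A3B-mlx`.")
    else:
        print(f"Less than {REQUIRED_GIB} GB of RAM detected; the model may not run.")
```

The check only looks at installed memory, not at what other applications are currently using.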


🧠 Model Overview

Qwen3-VL-30B-A3B-Instruct is a vision-language model from the Qwen3 series, offering advanced reasoning, spatial perception, long-context understanding, and tight integration of text and visual data. The A3B suffix denotes a Mixture-of-Experts (MoE) design in which roughly 3B of the 30B total parameters are active per token. This is the instruct-tuned variant.

🔑 Key Features

  • Visual Agent Capabilities: Understands and interacts with GUIs, software tools, and system elements for agentic task automation.

  • Visual Coding Generation: Converts images or video layouts into HTML, CSS, and JS code, or into diagrams for tools like Draw.io.

  • Spatial & Temporal Reasoning: Handles complex spatial tasks (2D/3D object grounding, occlusion) and aligns language with video events.

  • Multimodal Reasoning: Handles STEM, math, and logic tasks with causal, evidence-based answers across text, image, and video inputs.

  • 256K+ Context Length: Handles very long documents and hours of video input with second-level indexing and full recall.

  • High-Performance OCR: Recognizes 32 languages, including ancient scripts and scientific notation, and performs well on low-light or blurry images.

  • Multilingual & Instruction Following: Supports over 100 languages with robust multilingual instruction tuning and translation quality.


🏗️ Architecture Details

  • Model Type: Vision-Language Causal Transformer

  • Architecture Enhancements:

    • Interleaved-MRoPE: Improved positional embeddings for long-horizon vision tasks.
    • DeepStack: Multi-level ViT feature fusion for fine-grained alignment.
    • Text-Timestamp Alignment: Enhanced video temporal localization.
  • Context Length: Up to 256K tokens (expandable to 1M)

  • Model Size: 30B total parameters (about 3B active per token)

  • Architecture: MoE (Mixture of Experts)
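The 30B parameter count gives a rough lower bound on the memory needed for the weights alone. The sketch below shows the arithmetic under stated assumptions: the bit widths are illustrative (this card does not state the precision of the MLX build), the function name is hypothetical, and the estimate ignores activations, the KV cache, and the vision encoder's working memory. In a Mixture-of-Experts model, all experts normally stay resident in memory even though only a subset runs for each token, so the total count is the relevant one here.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GiB (1 GiB = 2**30 bytes)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    total_params = 30e9  # "30B parameters" from the list above
    # Illustrative precisions only; not the stated format of this model.
    for bits in (4, 8, 16):
        print(f"{bits}-bit weights: ~{weight_memory_gib(total_params, bits):.1f} GiB")
```

Actual memory use will be higher than these figures, since long contexts and image or video inputs add to the footprint.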
