📌 Overview

A 4-bit MLX quantized version of Qwen3-30B—A6B optimized for efficient inference using the MLX library, designed to handle long-context tasks (192k tokens) with reduced resource usage. Retains core capabilities of Qwen3 while enabling deployment on edge devices.

Downloads last month: 39

Safetensors

Model size

5B params

Tensor type

BF16

U32

Model tree for Goraint/Qwen3-30B-A6B-16-Extreme-128k-context-MLX-RTN-4bit

Base model

DavidAU/Qwen3-30B-A6B-16-Extreme-128k-context

Quantized

(2)

this model