gemma-4-26B-A4B-it-F32-GGUF

Gemma-4-26B-A4B-it from Google is a 26B-total-parameter Mixture-of-Experts (MoE) multimodal model with only 3.8B-4B active parameters per forward pass (8 routed experts plus 1 shared expert out of 128 total), delivering quality close to that of its dense 31B sibling at a dramatically lower compute and memory cost. It supports a 256K context length with a 1024-token sliding window, text and image inputs at variable aspect ratios and resolutions, and advanced agentic capabilities. With 30 layers and a 262K-token vocabulary covering 140+ languages, the instruction-tuned variant excels at reasoning (with configurable thinking modes), coding, OCR and handwriting recognition, document parsing, UI analysis, chart comprehension, and object detection with pointing. It is optimized for high-throughput server and workstation deployment on NVIDIA and AMD GPUs via vLLM or llama.cpp, and is released under the Apache 2.0 license. Positioned between the edge-focused E2B/E4B models and the flagship 31B model in the Gemma 4 family, it balances frontier-level multimodal intelligence with production-scale efficiency for enterprise agents, function calling, and structured data workflows.

Quick start with llama.cpp

llama-server -hf prithivMLmods/gemma-4-26B-A4B-it-F32-GGUF:F32
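For image inputs, also pass the matching multimodal projector (mmproj) file from this repo. A minimal sketch, assuming a recent llama.cpp build with --mmproj support and the projector file downloaded locally; the quant choice, context size, GPU offload count, and port are illustrative, not prescriptive:

# Serve the model with vision support (values are examples, tune per GPU):
llama-server -hf prithivMLmods/gemma-4-26B-A4B-it-F32-GGUF:Q8_0 \
    --mmproj gemma-4-26B-A4B-it.mmproj-f16.gguf \
    -c 32768 -ngl 99 --port 8080

# llama-server exposes an OpenAI-compatible HTTP API; query it from another shell:
curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "Summarize this model in two sentences."}]}'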

Model Files

File Name                            Quant Type    File Size
gemma-4-26B-A4B-it.BF16.gguf         BF16          50.5 GB
gemma-4-26B-A4B-it.F16.gguf          F16           50.5 GB
gemma-4-26B-A4B-it.F32.gguf          F32           101 GB
gemma-4-26B-A4B-it.Q8_0.gguf         Q8_0          26.9 GB
gemma-4-26B-A4B-it.mmproj-bf16.gguf  mmproj-bf16   1.19 GB
gemma-4-26B-A4B-it.mmproj-f16.gguf   mmproj-f16    1.19 GB
gemma-4-26B-A4B-it.mmproj-f32.gguf   mmproj-f32    2.29 GB
gemma-4-26B-A4B-it.mmproj-q8_0.gguf  mmproj-q8_0   806 MB
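
Individual files can also be fetched directly with huggingface-cli (part of the huggingface_hub package). A sketch; the filenames are examples taken from the table above:

huggingface-cli download prithivMLmods/gemma-4-26B-A4B-it-F32-GGUF \
    gemma-4-26B-A4B-it.Q8_0.gguf \
    gemma-4-26B-A4B-it.mmproj-f16.gguf \
    --local-dir .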

Quants Usage

Quants are listed by size, which does not necessarily reflect quality; IQ-quants are often preferable to similarly sized non-IQ quants.
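
If the chosen quant does not fit entirely in VRAM, llama.cpp can split the model between GPU and CPU. A sketch for a GPU that cannot hold the full Q8_0 weights; the layer count (out of this model's 30 layers) is illustrative and should be tuned to your hardware:

# Offload only part of the model to the GPU, keeping the remaining layers on CPU:
llama-server -hf prithivMLmods/gemma-4-26B-A4B-it-F32-GGUF:Q8_0 -ngl 20 -c 8192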

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[graph by ikawrakow: quality comparison of lower-quality quant types]
