olmOCR-2-7B-1025-MLX-8bit

This is an 8-bit quantized version of allenai/olmOCR-2-7B-1025 optimized for Apple Silicon using MLX.

Model Description

olmOCR-2 is a state-of-the-art OCR (Optical Character Recognition) vision-language model fine-tuned from Qwen2.5-VL-7B-Instruct. This 8-bit quantized version provides excellent quality with significantly reduced memory footprint.

Base Model: allenai/olmOCR-2-7B-1025
Quantization: 8-bit using MLX
Model Size: 8.4 GB (down from ~14 GB BF16)
Size Reduction: ~40%

Performance

olmOCR-2 achieves 82.4 points on olmOCR-Bench, representing state-of-the-art performance for real-world OCR of English-language digitized print documents. The model was further fine-tuned with GRPO reinforcement-learning (RL) training to boost performance on:

  • Math equations
  • Tables
  • Complex layouts
  • Handwriting

Usage

Requirements

pip install mlx-vlm

Basic Usage

from mlx_vlm import load, generate
from PIL import Image

# Load the quantized model and its processor
model, processor = load("richardyoung/olmOCR-2-7B-1025-MLX-8bit")

# Load the page image to transcribe
image = Image.open("document.png")

# Extract text. Note: the positional order of image and prompt has changed
# across mlx-vlm releases; check generate()'s signature in your installed version.
prompt = "Extract all text from this image."
output = generate(model, processor, image, prompt, max_tokens=2048)
print(output)
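Scanned pages are often much larger than the model needs, which wastes memory and slows inference. A minimal preprocessing sketch using only PIL; the 1288 px target for the longest side is an assumption borrowed from typical olmOCR page-rendering settings, not something this card specifies:

```python
from PIL import Image

def prepare_page(path, longest_side=1288):
    """Downscale a page image so its longest side is at most `longest_side`,
    preserving aspect ratio. Returns an RGB PIL image ready for the model."""
    image = Image.open(path).convert("RGB")
    scale = longest_side / max(image.size)
    if scale < 1.0:  # only shrink, never upscale
        new_size = (round(image.size[0] * scale), round(image.size[1] * scale))
        image = image.resize(new_size, Image.LANCZOS)
    return image
```

The returned image can be passed to generate() in place of the raw Image.open() result above.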

Command Line

python -m mlx_vlm.generate \
  --model richardyoung/olmOCR-2-7B-1025-MLX-8bit \
  --image document.png \
  --prompt "Extract all text from this image." \
  --max-tokens 2048

Quantization Details

  • Method: MLX native quantization
  • Bits: 8-bit
  • Group Size: MLX default (64)
  • Recommended for: Users who prioritize quality and have sufficient RAM (10GB+)
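The size figures above can be sanity-checked from first principles. MLX group quantization stores a shared fp16 scale and fp16 bias per group of weights, adding a small per-weight overhead on top of the nominal bit width. A back-of-the-envelope sketch (the group size of 64 and the assumption that roughly all 7B parameters are quantized are mine, not from this card):

```python
def effective_bits_per_weight(bits, group_size=64):
    """Nominal bits plus the amortized cost of one fp16 scale and one fp16
    bias per group (2 * 16 bits spread over `group_size` weights)."""
    return bits + 2 * 16 / group_size

def quantized_size_gb(n_params, bits, group_size=64):
    """Approximate checkpoint size in GB for a fully quantized model."""
    return n_params * effective_bits_per_weight(bits, group_size) / 8 / 1e9

params = 7e9
print(f"BF16:  {params * 16 / 8 / 1e9:.1f} GB")         # 14.0 GB
print(f"8-bit: {quantized_size_gb(params, 8):.1f} GB")  # 7.4 GB
```

The measured 8.4 GB checkpoint is somewhat larger than the 7.4 GB estimate because embeddings and some layers are typically kept unquantized.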

Model Variants

Variant  Size    Quality  Use Case
8-bit    8.4 GB  Highest  Best quality, more RAM
6-bit    6.4 GB  High     Balanced quality/size
4-bit    4.5 GB  Good     Smallest size, less RAM

System Requirements

  • Platform: Apple Silicon (M1/M2/M3/M4)
  • RAM: 10+ GB recommended
  • OS: macOS 12.0+

Limitations

  • Optimized primarily for English-language printed documents
  • May have reduced performance on handwritten text compared to printed text
  • Requires Apple Silicon hardware for optimal performance

Citation

@article{olmocr2,
  title={olmOCR 2: Unit test rewards for document OCR},
  author={{Allen Institute for AI}},
  year={2025}
}

License

Apache 2.0 (inherited from base model)

Acknowledgements


Generated with Claude Code
