olmOCR-2-7B-1025-MLX-8bit

This is an 8-bit quantized version of allenai/olmOCR-2-7B-1025 optimized for Apple Silicon using MLX.

Model Description

olmOCR-2 is a state-of-the-art OCR (Optical Character Recognition) vision-language model fine-tuned from Qwen2.5-VL-7B-Instruct. This 8-bit quantized version provides excellent quality with significantly reduced memory footprint.

Base Model: allenai/olmOCR-2-7B-1025
Quantization: 8-bit using MLX
Model Size: 8.4 GB (down from ~14 GB BF16)
Size Reduction: ~40%

Performance

olmOCR-2 achieves 82.4 points on olmOCR-Bench, representing state-of-the-art performance for real-world OCR of English-language digitized print documents. The model was further fine-tuned with GRPO reinforcement-learning (RL) training to boost performance on:

  • Math equations
  • Tables
  • Complex layouts
  • Handwriting

Usage

Requirements

pip install mlx-vlm

Basic Usage

from mlx_vlm import load, generate
from PIL import Image

# Load the quantized model and its processor
model, processor = load("richardyoung/olmOCR-2-7B-1025-MLX-8bit")

# Load the page image to transcribe
image = Image.open("document.png")

# Extract text. Note: the positional order of image and prompt has changed
# across mlx-vlm releases; check generate()'s signature in your installed version.
prompt = "Extract all text from this image."
output = generate(model, processor, image, prompt, max_tokens=2048)
print(output)
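Scanned pages are often much larger than the model needs, which wastes memory and slows inference. A minimal preprocessing sketch using only PIL; the 1288 px target for the longest side is an assumption borrowed from typical olmOCR page-rendering settings, not something this card specifies:

```python
from PIL import Image

def prepare_page(path, longest_side=1288):
    """Downscale a page image so its longest side is at most `longest_side`,
    preserving aspect ratio. Returns an RGB PIL image ready for the model."""
    image = Image.open(path).convert("RGB")
    scale = longest_side / max(image.size)
    if scale < 1.0:  # only shrink, never upscale
        new_size = (round(image.size[0] * scale), round(image.size[1] * scale))
        image = image.resize(new_size, Image.LANCZOS)
    return image
```

The returned image can be passed to generate() in place of the raw Image.open() result above.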

Command Line

python -m mlx_vlm.generate \
  --model richardyoung/olmOCR-2-7B-1025-MLX-8bit \
  --image document.png \
  --prompt "Extract all text from this image." \
  --max-tokens 2048

Quantization Details

  • Method: MLX native quantization
  • Bits: 8-bit
  • Group Size: MLX default (64)
  • Recommended for: Users who prioritize quality and have sufficient RAM (10GB+)
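The size figures above can be sanity-checked from first principles. MLX group quantization stores a shared fp16 scale and fp16 bias per group of weights, adding a small per-weight overhead on top of the nominal bit width. A back-of-the-envelope sketch (the group size of 64 and the assumption that roughly all 7B parameters are quantized are mine, not from this card):

```python
def effective_bits_per_weight(bits, group_size=64):
    """Nominal bits plus the amortized cost of one fp16 scale and one fp16
    bias per group (2 * 16 bits spread over `group_size` weights)."""
    return bits + 2 * 16 / group_size

def quantized_size_gb(n_params, bits, group_size=64):
    """Approximate checkpoint size in GB for a fully quantized model."""
    return n_params * effective_bits_per_weight(bits, group_size) / 8 / 1e9

params = 7e9
print(f"BF16:  {params * 16 / 8 / 1e9:.1f} GB")         # 14.0 GB
print(f"8-bit: {quantized_size_gb(params, 8):.1f} GB")  # 7.4 GB
```

The measured 8.4 GB checkpoint is somewhat larger than the 7.4 GB estimate because embeddings and some layers are typically kept unquantized.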

Model Variants

Variant  Size    Quality  Use Case
8-bit    8.4 GB  Highest  Best quality, more RAM
6-bit    6.4 GB  High     Balanced quality/size
4-bit    4.5 GB  Good     Smallest size, less RAM

System Requirements

  • Platform: Apple Silicon (M1/M2/M3/M4)
  • RAM: 10+ GB recommended
  • OS: macOS 12.0+

Limitations

  • Optimized primarily for English-language printed documents
  • May have reduced performance on handwritten text compared to printed text
  • Requires Apple Silicon hardware for optimal performance

Citation

@article{olmocr2,
  title={olmOCR 2: Unit test rewards for document OCR},
  author={{Allen Institute for AI}},
  year={2025}
}

License

Apache 2.0 (inherited from base model)

Acknowledgements


Generated with Claude Code
