---
license: mit
language:
- en
tags:
- mlx
- apple-silicon
- multimodal
- vision-language
- pixtral
- llava
- quantized
- 3bit
- 4bit
- 5bit
- 6bit
pipeline_tag: image-text-to-text
library_name: mlx
---

# Apriel-1.5-15B-Thinker — **MLX 3-bit** (Apple Silicon)

**Format:** MLX (Mac, Apple Silicon)
**Quantization:** **3-bit** (balanced footprint ↔ quality)
**Base:** ServiceNow-AI/Apriel-1.5-15B-Thinker
**Architecture:** Pixtral-style LLaVA (vision encoder → 2-layer projector → decoder)

This repository provides a **3-bit MLX** build of Apriel-1.5-15B-Thinker for **on-device** multimodal inference on Apple Silicon. In side-by-side tests, the **3-bit** variant often:
- uses **significantly less RAM** than 6-bit,
- decodes **faster**, and
- tends to produce **more direct answers** (less “thinking out loud”) at low temperature.

If RAM allows, we also suggest trying **4-bit/5-bit/6-bit** variants (guidance below) for tasks that demand more fidelity.

> Explore other Apriel MLX variants under the `mlx-community` namespace on the Hub.

---

## 🔎 Upstream → MLX summary

Apriel-1.5-15B-Thinker is a multimodal reasoning VLM built via **depth upscaling**, **two-stage multimodal continual pretraining**, and **SFT with explicit reasoning traces** (math, coding, science, tool use).
This MLX release converts the upstream checkpoint with **3-bit** quantization for a smaller memory footprint and quicker startup on macOS.

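For reference, a conversion along these lines can be reproduced with `mlx_vlm.convert`. The command below is a sketch, not the exact invocation used for this build: the output path is a placeholder, and the quantization flags assume the converter's standard `-q`/`--q-bits` interface.

```bash
# Illustrative reconversion sketch (not the exact command used for this repo):
# quantize the upstream checkpoint to 3-bit MLX weights.
python -m mlx_vlm.convert \
  --hf-path ServiceNow-AI/Apriel-1.5-15B-Thinker \
  --mlx-path ./apriel-1.5-15b-thinker-mlx-3bit \
  -q --q-bits 3
```
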
---

## 📦 Contents

- `config.json` (MLX config for Pixtral-style VLM)
- `mlx_model*.safetensors` (3-bit shards)
- `tokenizer.json`, `tokenizer_config.json`
- `processor_config.json` / `image_processor.json`
- `model_index.json` and metadata

---

## 🚀 Quickstart (CLI)

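The examples assume the `mlx-vlm` package, which provides the `mlx_vlm` CLI modules used below, is installed:

```bash
# Install the mlx-vlm runtime (pulls in mlx as a dependency).
pip install -U mlx-vlm
```
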
**Single image caption**
```bash
python -m mlx_vlm.generate \
  --model <this-repo-id> \
  --image /path/to/image.jpg \
  --prompt "Describe this image in two concise sentences." \
  --max-tokens 128 --temperature 0.0 --seed 0
```
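
The same CLI covers visual question answering. Here is a variation on the caption example above; the image path and prompt are placeholders:

```bash
# Hypothetical VQA invocation; same flags as the caption example above.
python -m mlx_vlm.generate \
  --model <this-repo-id> \
  --image /path/to/chart.png \
  --prompt "What is the highest value in this chart, and in which category?" \
  --max-tokens 256 --temperature 0.0 --seed 0
```

Raise `--max-tokens` for longer, reasoning-heavy answers; the upstream model was trained with explicit reasoning traces and may spend tokens on intermediate steps.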