# Qwen2.5-1.5B-Executorch-Q8DA4W
This repository contains the `qwen2_5_1_5b_q8da4w.pte` model, exported for use with ExecuTorch.
## Details
- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Format: `.pte` (ExecuTorch)
- Quantization: Q8DA4W (4-bit linear weights, 8-bit dynamic activations)
- Architecture: Qwen2
- File Size: ~1.6 GB
## Features
- 🚀 Optimized for mobile/edge devices
- 📱 Compatible with `react-native-executorch`
- 🌍 Excellent multilingual support (including Vietnamese!)
- 💬 Strong instruction-following capabilities
- 🧠 Built on Alibaba's Qwen2.5, a family known for strong reasoning
## Usage
This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or react-native-executorch.
- Download `qwen2_5_1_5b_q8da4w.pte` and the tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`).
- Place them in your app's asset folder.
- Load the model with the ExecuTorch runtime (see the sketch below for a `react-native-executorch` example).
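If you integrate through `react-native-executorch`, the flow typically looks like the sketch below. This is a minimal, hedged example: the `useLLM` hook exists in the library, but the exact prop and field names (`modelSource`, `tokenizerSource`, `isReady`, `response`) and whether `generate` takes a string or a messages array vary between releases, so treat them as assumptions and check the documentation of the version you install.

```tsx
// Minimal sketch (assumed API): prop and field names below follow recent
// react-native-executorch releases and may differ in the version you install.
import { useLLM } from 'react-native-executorch';

export function QwenDemo() {
  const llm = useLLM({
    // Bundled assets; the paths are placeholders for wherever you ship the files.
    modelSource: require('../assets/qwen2_5_1_5b_q8da4w.pte'),
    tokenizerSource: require('../assets/tokenizer.json'),
  });

  const ask = async () => {
    if (!llm.isReady) return;                   // readiness flag name is an assumption
    await llm.generate('Xin chào! Bạn là ai?'); // some versions expect a messages array instead
    console.log(llm.response);                  // generated output exposed by the hook
  };

  // Wire `ask` to a button and render `llm.response` in your UI.
  return null;
}
```

Native iOS/Android apps can load the same `.pte` file directly through the ExecuTorch runtime APIs instead of going through React Native.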
## Notes
- Qwen2 uses a byte-level BPE tokenizer (similar to GPT-2), not SentencePiece.
- Tokenizer files: `tokenizer.json`, `vocab.json`, `merges.txt`
- Vocab size: 151,936 tokens
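To sanity-check the tokenizer outside the app (for example in a Node.js script), the same byte-level BPE vocabulary can be loaded with transformers.js. This is only an illustrative sketch: the `@huggingface/transformers` package and the `AutoTokenizer` calls shown are assumptions about that library, not something shipped in this repository.

```ts
// Sketch: inspect the Qwen2 byte-level BPE tokenizer with transformers.js.
// Run with Node.js (ESM); from_pretrained downloads the tokenizer files.
import { AutoTokenizer } from '@huggingface/transformers';

const tokenizer = await AutoTokenizer.from_pretrained('Qwen/Qwen2.5-1.5B-Instruct');

// Byte-level BPE: arbitrary UTF-8 text (e.g. Vietnamese) maps to ids within the
// 151,936-token vocabulary, with no out-of-vocabulary failures.
const ids = tokenizer.encode('Xin chào, thế giới!');
console.log(ids);
```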