🇮🇳 NLLB-200-Distilled-600M (English ↔ Mizo) – QLoRA Fine-tune

Author: flt007 · Base: facebook/nllb-200-distilled-600M

Low-resource English ↔ Mizo translation model fine-tuned with QLoRA.


🧠 Overview

A lightweight NLLB derivative trained on bilingual Mizo–English text to improve coverage for the under-represented Mizo language (lus_Latn).


⚙️ Usage
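
The snippet below loads the merged checkpoint and translates an English sentence into Mizo (lus_Latn).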

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the merged checkpoint and its tokenizer; NLLB uses FLORES-200
# language codes, so English source text is tagged as eng_Latn.
model = AutoModelForSeq2SeqLM.from_pretrained('flt007/mbart-mizo-merged')
tok = AutoTokenizer.from_pretrained('flt007/mbart-mizo-merged', src_lang='eng_Latn')

text = 'We must protect our forests.'
inp = tok(text, return_tensors='pt')

# Force the decoder to start with the Mizo language token (lus_Latn).
out = model.generate(
    **inp,
    forced_bos_token_id=tok.convert_tokens_to_ids('lus_Latn'),
    max_new_tokens=50
)
print(tok.decode(out[0], skip_special_tokens=True))

๐Ÿ— Training Details

  • Method: QLoRA (LoRA adapters on an 8-bit quantized base model; see the sketch after this list)
  • Dataset: Prototype (4 bilingual pairs)
  • Epochs: 1
  • Hardware: Google Colab T4 GPU (16 GB VRAM)
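
For reference, below is a minimal sketch of how such a QLoRA run could be set up with peft and bitsandbytes. The hyperparameters, target modules, and variable names are illustrative assumptions, not the exact training script used for this checkpoint.

from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = 'facebook/nllb-200-distilled-600M'

# Load the base model with 8-bit weights, then prepare it for k-bit training.
bnb = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForSeq2SeqLM.from_pretrained(
    base_id, quantization_config=bnb, device_map='auto'
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters to the attention projections (settings are illustrative).
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=['q_proj', 'v_proj'],
    task_type='SEQ_2_SEQ_LM',
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# From here, train for one epoch with Seq2SeqTrainer on tokenized
# eng_Latn -> lus_Latn pairs, then merge and save the adapters.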

🔮 Next Steps

  • Expand the dataset to more than 10,000 bilingual pairs
  • Multi-epoch training & BLEU/chrF evaluation (see the scoring sketch after this list)
  • Bidirectional English ↔ Mizo model release
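
As a starting point for that evaluation, a held-out test set could be scored with sacrebleu. The lists below are placeholders; real hypotheses and references would come from the model and the test set.

import sacrebleu

# Model outputs and gold Mizo references (placeholder strings).
hypotheses = ['model output 1', 'model output 2']
references = [['reference 1', 'reference 2']]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f'BLEU: {bleu.score:.2f}  chrF: {chrf.score:.2f}')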

📜 License

Released under the CC BY-NC 4.0 license.


โค๏ธ Acknowledgments

  • Meta AI (NLLB)
  • Hugging Face
  • Frankie Thiak