🇮🇳 NLLB-200-Distilled-600M (English ↔ Mizo) – QLoRA Fine-tune

Author: flt007 · Base: facebook/nllb-200-distilled-600M

Low-resource English ↔ Mizo translation model fine-tuned with QLoRA.


🧠 Overview

A lightweight NLLB derivative trained on bilingual Mizo–English text to improve coverage for the under-represented Mizo language (lus_Latn).


⚙️ Usage
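
The snippet below loads the merged checkpoint and translates an English sentence into Mizo (lus_Latn).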

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the merged checkpoint and its tokenizer; NLLB uses FLORES-200
# language codes, so English source text is tagged as eng_Latn.
model = AutoModelForSeq2SeqLM.from_pretrained('flt007/mbart-mizo-merged')
tok = AutoTokenizer.from_pretrained('flt007/mbart-mizo-merged', src_lang='eng_Latn')

text = 'We must protect our forests.'
inp = tok(text, return_tensors='pt')

# Force the decoder to start with the Mizo language token (lus_Latn).
out = model.generate(
    **inp,
    forced_bos_token_id=tok.convert_tokens_to_ids('lus_Latn'),
    max_new_tokens=50
)
print(tok.decode(out[0], skip_special_tokens=True))

๐Ÿ— Training Details

  • Method: QLoRA (LoRA adapters on an 8-bit quantized base model; see the sketch after this list)
  • Dataset: Prototype (4 bilingual pairs)
  • Epochs: 1
  • Hardware: Google Colab T4 GPU (16 GB VRAM)
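
For reference, below is a minimal sketch of how such a QLoRA run could be set up with peft and bitsandbytes. The hyperparameters, target modules, and variable names are illustrative assumptions, not the exact training script used for this checkpoint.

from transformers import AutoModelForSeq2SeqLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = 'facebook/nllb-200-distilled-600M'

# Load the base model with 8-bit weights, then prepare it for k-bit training.
bnb = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForSeq2SeqLM.from_pretrained(
    base_id, quantization_config=bnb, device_map='auto'
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters to the attention projections (settings are illustrative).
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=['q_proj', 'v_proj'],
    task_type='SEQ_2_SEQ_LM',
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# From here, train for one epoch with Seq2SeqTrainer on tokenized
# eng_Latn -> lus_Latn pairs, then merge and save the adapters.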

🔮 Next Steps

  • Expand the dataset to more than 10,000 bilingual pairs
  • Multi-epoch training & BLEU/chrF evaluation (see the scoring sketch after this list)
  • Bidirectional English ↔ Mizo model release
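
As a starting point for that evaluation, a held-out test set could be scored with sacrebleu. The lists below are placeholders; real hypotheses and references would come from the model and the test set.

import sacrebleu

# Model outputs and gold Mizo references (placeholder strings).
hypotheses = ['model output 1', 'model output 2']
references = [['reference 1', 'reference 2']]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
print(f'BLEU: {bleu.score:.2f}  chrF: {chrf.score:.2f}')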

📜 License

Released under the CC BY-NC 4.0 license.


โค๏ธ Acknowledgments

  • Meta AI (NLLB)
  • Hugging Face
  • Frankie Thiak