Model Card: ModernAraBERT

Summary

  • Arabic encoder adapted from answerdotai/ModernBERT-base via continued pretraining on Arabic corpora (~9.8GB).
  • Outperforms AraBERTv1, AraBERTv2, mBERT, and MARBERT on sentiment analysis (Macro-F1), NER (Macro-F1), and extractive QA (exact match, EM) in the paper's evaluation.
  • License: MIT · Paper: LREC 2026 · Hub: gizadatateam/ModernAraBERT

Intended Uses

  • Masked LM, feature extraction, and transfer learning for Arabic tasks.
  • Downstream: sentiment analysis, NER, extractive QA, general classification/labeling.

How to use

from transformers import AutoTokenizer, AutoModelForMaskedLM
name = "gizadatateam/ModernAraBERT"
model = AutoModelForMaskedLM.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)
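
For a quick sanity check of the masked-LM head, the loading code above can be wrapped in a fill-mask pipeline. This is a minimal sketch, assuming the checkpoint works with the standard Transformers `fill-mask` pipeline and that a recent Transformers release with ModernBERT support is installed; `build_masked_sentence`, `predict_mask`, and the `{mask}` placeholder are hypothetical helper names, not part of the model's API.

```python
def build_masked_sentence(template: str, mask_token: str) -> str:
    """Replace the single '{mask}' placeholder with the tokenizer's mask token."""
    if template.count("{mask}") != 1:
        raise ValueError("template must contain exactly one '{mask}' placeholder")
    return template.replace("{mask}", mask_token)


def predict_mask(template: str, top_k: int = 5,
                 name: str = "gizadatateam/ModernAraBERT"):
    """Return (token, score) pairs for the masked position.

    The import is deferred so the helper above stays usable without
    Transformers installed; calling this function downloads the checkpoint.
    """
    from transformers import pipeline

    fill = pipeline("fill-mask", model=name)
    text = build_masked_sentence(template, fill.tokenizer.mask_token)
    return [(p["token_str"], p["score"]) for p in fill(text, top_k=top_k)]


# Example call (requires network access on first use):
# predict_mask("عاصمة مصر هي {mask}.")
```

Reading the mask token from the tokenizer, rather than hard-coding it, keeps the sketch independent of the exact special-token string the checkpoint uses.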

Training data and recipe (brief)

  • Corpora: OSIAN, Arabic Billion Words, Arabic Wikipedia, OSCAR Arabic
  • Tokenizer: ModernBERT vocabulary extended with 80K Arabic tokens
  • Objective: masked language modeling (MLM); 3 epochs; sequence length 128→512
  • Hardware: A100 40GB; framework: PyTorch + Transformers + Accelerate
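
The recipe above (extend the vocabulary, then continue MLM pretraining) can be sketched with standard Transformers calls. This is an illustration under stated assumptions, not the authors' training code: the card does not say how the 80K tokens were selected or added, `add_tokens` is just one common way to extend a vocabulary, the 0.15 masking rate is the library default rather than a reported value, and `new_tokens_only` and `extend_for_arabic` are hypothetical helper names.

```python
def new_tokens_only(candidates, existing_vocab):
    """Drop empty, duplicate, and already-known tokens, preserving order."""
    seen = set()
    kept = []
    for tok in candidates:
        if tok and tok not in existing_vocab and tok not in seen:
            seen.add(tok)
            kept.append(tok)
    return kept


def extend_for_arabic(arabic_tokens, base: str = "answerdotai/ModernBERT-base"):
    """Load the base model, add new tokens, and build an MLM data collator."""
    # Deferred import: only needed when the model is actually loaded.
    from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                              DataCollatorForLanguageModeling)

    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForMaskedLM.from_pretrained(base)

    # add_tokens returns how many tokens were actually added.
    added = tokenizer.add_tokens(
        new_tokens_only(arabic_tokens, tokenizer.get_vocab()))
    # New embedding rows must exist before training on the new token ids.
    model.resize_token_embeddings(len(tokenizer))

    # Assumption: default 15% masking; the card does not state the rate.
    collator = DataCollatorForLanguageModeling(
        tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
    return tokenizer, model, collator, added
```

The returned collator can be passed to a training loop or `Trainer`; the staged sequence length (128→512) would be handled by the dataset tokenization step, which is omitted here.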

Evaluation (from paper)

Sentiment Analysis: Macro-F1 (%)

Model          LABR   HARD   AJGT
AraBERTv1      45.35  72.65  58.01
AraBERTv2      45.79  67.10  53.59
mBERT          44.18  71.70  61.55
MARBERT        45.54  67.39  60.63
ModernAraBERT  56.45  89.37  70.54

NER: Macro-F1 (%)

Model          Macro-F1
AraBERTv1      13.46
AraBERTv2      16.77
mBERT          12.15
MARBERT         7.42
ModernAraBERT  28.23
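
For reference, Macro-F1 averages per-class F1 scores with equal weight per class, so rare classes count as much as frequent ones. The sketch below is a plain-Python illustration of that definition; the paper's exact evaluation script (for example, token-level versus entity-level scoring for NER) is not specified in this card, and `macro_f1` is a hypothetical helper name.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


gold = ["pos", "pos", "neg", "neg"]
pred = ["pos", "neg", "neg", "neg"]
score = macro_f1(gold, pred)  # per-class F1: pos = 2/3, neg = 4/5
```

Multiply by 100 to compare with the percentages reported in the tables.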

QA (ARCD test): EM (%)

Model          EM
AraBERT        25.36
AraBERTv2      26.08
mBERT          25.12
MARBERT        23.58
ModernAraBERT  27.10
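
Exact match (EM) counts a prediction as correct only if it equals one of the gold answers after normalization. The sketch below is a simplified illustration: it normalizes whitespace only, whereas the paper's evaluation may also strip punctuation or Arabic diacritics (not stated in this card). `normalize` and `exact_match` are hypothetical helper names.

```python
def normalize(text: str) -> str:
    """Collapse runs of whitespace and trim (assumption: no other cleanup)."""
    return " ".join(text.split())


def exact_match(predictions, references) -> float:
    """Percentage of predictions equal to any gold answer for that question.

    predictions: list of predicted answer strings
    references:  list of lists of gold answer strings, one list per question
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must align one-to-one")
    if not predictions:
        raise ValueError("inputs must be non-empty")
    hits = sum(
        any(normalize(pred) == normalize(gold) for gold in golds)
        for pred, golds in zip(predictions, references)
    )
    return 100.0 * hits / len(predictions)


em = exact_match(
    ["القاهرة", "2010"],
    [["القاهرة"], ["2011", "عام 2011"]],
)  # first prediction matches, second does not
```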

Citation

@inproceedings{<paper_id>,
  title={Efficient Adaptation of English Language Models for Low-Resource and Morphologically Rich Languages: The Case of Arabic},
  author={Maher and Eldamaty and Ashraf and ElShawi and Mostafa},
  booktitle={Proceedings of <conference_name>},
  year={2025},
  organization={<conference_name>}
}

Model size: 0.2B params · Tensor type: F32 · Format: Safetensors