---
license: mit
language:
- en
---

# mmBERT Checkpoints

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Paper](https://img.shields.io/badge/Paper-Arxiv-red)](https://arxiv.org/abs/2509.06888)
[![Models](https://img.shields.io/badge/🤗%20Hugging%20Face-12%20Models-blue)](https://huggingface.co/collections/jhu-clsp/mmbert-a-modern-multilingual-encoder-68b725831d7c6e3acc435ed4)
[![GitHub](https://img.shields.io/badge/GitHub-Code-black)](https://github.com/jhu-clsp/mmBERT)

This repository contains the raw training checkpoints for the mmBERT models. Each model's folder contains three subfolders, one per training phase: `pretrain`, `ext`, and `decay`.

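If you only need the checkpoints from a single phase, you can download just that subfolder rather than the whole repository. Here is a minimal sketch using `huggingface_hub`; the `repo_id` below is a placeholder, so substitute this repository's actual id:

```python
# Minimal sketch: download only the decay-phase checkpoints.
# The repo_id is a placeholder -- replace it with this repository's id.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="jhu-clsp/mmbert-checkpoints",  # hypothetical id
    allow_patterns=["decay/*"],             # fetch just the `decay` subfolder
)
print(f"Checkpoints downloaded to: {local_dir}")
```
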
These files are Composer checkpoints and contain all of the state needed to resume pre-training. Please see the [ModernBERT repository](https://github.com/AnswerDotAI/ModernBERT) for usage details.

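At the Composer level, resuming comes down to pointing the `Trainer`'s `load_path` at a checkpoint file. The ModernBERT repository's YAML-driven entrypoint is the supported route; the sketch below only illustrates the mechanism, with a stand-in model and a hypothetical checkpoint filename:

```python
# Minimal sketch of resuming from a Composer checkpoint. The model and
# dataloader are stand-ins; replace them with the ModernBERT/mmBERT setup
# from the training repository. The checkpoint path is hypothetical.
import torch
from torch.utils.data import DataLoader, TensorDataset
from composer import Trainer
from composer.models import ComposerClassifier

model = ComposerClassifier(torch.nn.Linear(8, 2), num_classes=2)
dataloader = DataLoader(
    TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,))),
    batch_size=8,
)

trainer = Trainer(
    model=model,
    train_dataloader=dataloader,
    max_duration="1ep",
    load_path="pretrain/latest-rank0.pt",  # hypothetical filename from this repo
)
trainer.fit()  # training resumes from the loaded state
```

By default Composer restores optimizer, scheduler, and timestamp state along with the weights, which is what makes exact resumption possible; pass `load_weights_only=True` to load only the model weights.
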
## 🔗 Related Resources

- **Models**: [mmBERT Model Suite](https://huggingface.co/collections/jhu-clsp/mmbert-a-modern-multilingual-encoder-68b725831d7c6e3acc435ed4)
- **Phase 1**: [Pre-training Data](https://huggingface.co/datasets/jhu-clsp/mmbert-pretrain-p1-fineweb2-langs) (2.3T tokens)
- **Phase 2**: [Mid-training Data](https://huggingface.co/datasets/jhu-clsp/mmbert-midtraining) (600B tokens)
- **Phase 3**: [Decay Phase Data](https://huggingface.co/datasets/jhu-clsp/mmbert-decay) (100B tokens)
- **Paper**: [arXiv link](https://arxiv.org/abs/2509.06888)
- **Code**: [GitHub Repository](https://github.com/jhu-clsp/mmBERT)

## Citation

```bibtex
@misc{marone2025mmbertmodernmultilingualencoder,
      title={mmBERT: A Modern Multilingual Encoder with Annealed Language Learning},
      author={Marc Marone and Orion Weller and William Fleshman and Eugene Yang and Dawn Lawrie and Benjamin Van Durme},
      year={2025},
      eprint={2509.06888},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.06888},
}
```