---
license: mit
language:
- en
---

# mmBERT Checkpoints

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Paper](https://img.shields.io/badge/Paper-Arxiv-red)](https://arxiv.org/abs/2509.06888)
[![Models](https://img.shields.io/badge/🤗%20Hugging%20Face-12%20Models-blue)](https://huggingface.co/collections/jhu-clsp/mmbert-a-modern-multilingual-encoder-68b725831d7c6e3acc435ed4)
[![GitHub](https://img.shields.io/badge/GitHub-Code-black)](https://github.com/jhu-clsp/mmBERT)

This repository contains the raw training checkpoints for the mmBERT models. Each model's folder contains three subfolders, one per training phase: `pretrain`, `ext`, and `decay`.

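If you only need the checkpoints from a single phase, you can download just that subfolder rather than the whole repository. Here is a minimal sketch using `huggingface_hub`; the `repo_id` below is a placeholder, so substitute this repository's actual id:

```python
# Minimal sketch: download only the decay-phase checkpoints.
# The repo_id is a placeholder -- replace it with this repository's id.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="jhu-clsp/mmbert-checkpoints",  # hypothetical id
    allow_patterns=["decay/*"],             # fetch just the `decay` subfolder
)
print(f"Checkpoints downloaded to: {local_dir}")
```
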
These files are Composer checkpoints and contain all of the state needed to resume pre-training. Please see the [ModernBERT repository](https://github.com/AnswerDotAI/ModernBERT) for usage details.

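At the Composer level, resuming comes down to pointing the `Trainer`'s `load_path` at a checkpoint file. The ModernBERT repository's YAML-driven entrypoint is the supported route; the sketch below only illustrates the mechanism, with a stand-in model and a hypothetical checkpoint filename:

```python
# Minimal sketch of resuming from a Composer checkpoint. The model and
# dataloader are stand-ins; replace them with the ModernBERT/mmBERT setup
# from the training repository. The checkpoint path is hypothetical.
import torch
from torch.utils.data import DataLoader, TensorDataset
from composer import Trainer
from composer.models import ComposerClassifier

model = ComposerClassifier(torch.nn.Linear(8, 2), num_classes=2)
dataloader = DataLoader(
    TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,))),
    batch_size=8,
)

trainer = Trainer(
    model=model,
    train_dataloader=dataloader,
    max_duration="1ep",
    load_path="pretrain/latest-rank0.pt",  # hypothetical filename from this repo
)
trainer.fit()  # training resumes from the loaded state
```

By default Composer restores optimizer, scheduler, and timestamp state along with the weights, which is what makes exact resumption possible; pass `load_weights_only=True` to load only the model weights.
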
## 🔗 Related Resources

- **Models**: [mmBERT Model Suite](https://huggingface.co/collections/jhu-clsp/mmbert-a-modern-multilingual-encoder-68b725831d7c6e3acc435ed4)
- **Phase 1**: [Pre-training Data](https://huggingface.co/datasets/jhu-clsp/mmbert-pretrain-p1-fineweb2-langs) (2.3T tokens)
- **Phase 2**: [Mid-training Data](https://huggingface.co/datasets/jhu-clsp/mmbert-midtraining) (600B tokens)
- **Phase 3**: [Decay Phase Data](https://huggingface.co/datasets/jhu-clsp/mmbert-decay) (100B tokens)
- **Paper**: [arXiv link](https://arxiv.org/abs/2509.06888)
- **Code**: [GitHub Repository](https://github.com/jhu-clsp/mmBERT)

## Citation

```bibtex
@misc{marone2025mmbertmodernmultilingualencoder,
      title={mmBERT: A Modern Multilingual Encoder with Annealed Language Learning},
      author={Marc Marone and Orion Weller and William Fleshman and Eugene Yang and Dawn Lawrie and Benjamin Van Durme},
      year={2025},
      eprint={2509.06888},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.06888},
}
```