Commit 330ca60 by akhooli · verified · 1 Parent(s): 0c59e0f

Update README.md

Files changed (1)
  1. README.md +7 -3
README.md CHANGED
@@ -1,3 +1,7 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+ Arabic ModernBERT model partially trained (13% of one epoch)
+ on a filtered [subset](https://huggingface.co/datasets/akhooli/afw2_f98_tok) of
+ FineWeb2 (text length: 250-25000 characters, 98% or more Arabic words) pretokenized.
+ The actual filtered dataset (text column only) is [here](https://huggingface.co/datasets/akhooli/afw2_f98).
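
The filtering criteria named in the added README text (text length of 250 to 25000 characters, 98% or more Arabic words) could be sketched roughly as below. This is not the author's actual pipeline. The function names, the whitespace-based definition of a "word", and the rule that a word counts as Arabic when it contains a character from the basic Arabic Unicode block (U+0600 to U+06FF) are all assumptions made for illustration.

```python
import re

# Assumption: a word is "Arabic" if it contains at least one character
# from the basic Arabic Unicode block. The README does not define this.
ARABIC_CHAR = re.compile(r"[\u0600-\u06FF]")

# Thresholds taken from the README text.
MIN_CHARS = 250
MAX_CHARS = 25_000
MIN_ARABIC_RATIO = 0.98


def arabic_word_ratio(text: str) -> float:
    """Fraction of whitespace-separated words that contain an Arabic character."""
    words = text.split()
    if not words:
        return 0.0
    arabic = sum(1 for word in words if ARABIC_CHAR.search(word))
    return arabic / len(words)


def keep_document(text: str) -> bool:
    """Return True if the text passes both the length and the Arabic-ratio filter."""
    if not (MIN_CHARS <= len(text) <= MAX_CHARS):
        return False
    return arabic_word_ratio(text) >= MIN_ARABIC_RATIO
```

A filter like this could be applied row by row to a text column before tokenization; the real dataset may have used a different word segmentation or script detection rule.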