✨ Ettin 400M for NER
This repository hosts an Ettin 400M model that was fine-tuned on the CoNLL-2003 NER dataset with the awesome Flair library.
Please note the following caveats:
- ⚠️ To work around a tokenizer problem in ModernBERT/Ettin, this model was fine-tuned on a forked and modified Ettin 400M model.
- ⚠️ At the moment, don't expect "über" BERT-like performance; more experiments are needed. I am pretty sure that RoPE is causing this.
Implementation
The model was trained using my ModernBERT experiments repo.
Performance
A very basic hyperparameter search was performed with five different seeds per configuration. The table reports the micro F1-score of each run and the average over all runs on the development set of CoNLL-2003:
| Configuration | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Avg. | 
|---|---|---|---|---|---|---|
| bs16-e10-cs0-lr4e-05 | 96 | 96.17 | 96.31 | 96.19 | 96.2 | 96.17 ± 0.1 |
| bs16-e10-cs0-lr3e-05 | 96.25 | 96.23 | 96.12 | 96.3 | 95.81 | 96.14 ± 0.18 |
| bs16-e10-cs0-lr2e-05 | 96.09 | 96.24 | 95.88 | 96.1 | 96.12 | 96.09 ± 0.12 |
| bs16-e10-cs0-lr5e-05 | 95.98 | 95.93 | 96.11 | 96.1 | 96 | 96.02 ± 0.07 |
| bs16-e10-cs0-lr1e-05 | 95.77 | 95.8 | 96.14 | 96.01 | 95.84 | 95.91 ± 0.14 |
The performance of the currently uploaded model is marked in bold.
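The "Avg." column can be recomputed from the per-run scores. The following is a minimal sketch, not the script used to build the table: the helper name `summarize` is hypothetical, and the use of the population standard deviation is an assumption inferred from the reported values (it reproduces them after rounding), not something the source states.

```python
from statistics import mean, pstdev

# Development-set micro F1 scores copied from the table above (five seeds each).
runs = {
    "bs16-e10-cs0-lr4e-05": [96.0, 96.17, 96.31, 96.19, 96.2],
    "bs16-e10-cs0-lr3e-05": [96.25, 96.23, 96.12, 96.3, 95.81],
    "bs16-e10-cs0-lr2e-05": [96.09, 96.24, 95.88, 96.1, 96.12],
    "bs16-e10-cs0-lr5e-05": [95.98, 95.93, 96.11, 96.1, 96.0],
    "bs16-e10-cs0-lr1e-05": [95.77, 95.8, 96.14, 96.01, 95.84],
}


def summarize(scores):
    """Return (mean, population standard deviation), rounded to two decimals.

    Assumption: the table uses the population standard deviation (divide by n).
    """
    return round(mean(scores), 2), round(pstdev(scores), 2)


# Print one summary line per configuration, in table order.
for config, scores in runs.items():
    avg, std = summarize(scores)
    print(f"{config}: {avg:.2f} +/- {std:.2f}")
```

With only five seeds per configuration, the averages of the top configurations differ by less than their spread, so the ranking should be read with some caution.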
Usage
The following code can be used to test the model and recognize named entities for a given sentence:
from flair.data import Sentence
from flair.models import SequenceTagger
# Load the model
tagger = SequenceTagger.load("stefan-it/flair-ettin-400m-ner-conll03")
# Define an example sentence
sentence = Sentence("George Washington went to Washington very fast.")
# Now let's predict named entities...
tagger.predict(sentence)
# Print-out the recognized named entities
print("The following named entities are found:")
for entity in sentence.get_spans('ner'):
    print(entity)
This outputs:
Span[0:2]: "George Washington" → PER (1.0000)
Span[4:5]: "Washington" → LOC (1.0000)
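For downstream processing it can be handier to work with plain dictionaries instead of printed spans. The sketch below assumes that each Flair span exposes `text` and `get_label("ner")`, with `value` and `score` on the returned label. The helper names `spans_to_records` and `tag_text` are hypothetical and not part of Flair or this repository.

```python
from typing import Any, Dict, Iterable, List


def spans_to_records(spans: Iterable[Any], label_type: str = "ner") -> List[Dict[str, Any]]:
    """Convert Flair spans into plain dictionaries (hypothetical helper)."""
    records = []
    for span in spans:
        # Assumption: get_label() returns an object with .value and .score.
        label = span.get_label(label_type)
        records.append(
            {"text": span.text, "label": label.value, "score": round(label.score, 4)}
        )
    return records


def tag_text(text: str, model_name: str = "stefan-it/flair-ettin-400m-ner-conll03"):
    """Load the tagger, predict entities for `text` and return them as records."""
    # Imported lazily so that spans_to_records() is usable without Flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_name)
    sentence = Sentence(text)
    tagger.predict(sentence)
    return spans_to_records(sentence.get_spans("ner"))
```

Calling `tag_text("George Washington went to Washington very fast.")` downloads the model on first use and should return one record per recognized entity. The tagger is reloaded on every call, so for batch processing it is better to load it once and reuse it.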
Model tree for stefan-it/flair-ettin-400m-ner-conll03
Base model
stefan-it/ettin-encoder-400m-tokenizer-fix