Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs Paper • 2402.12030 • Published Feb 19, 2024 • 3
When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance Paper • 2509.22193 • Published Sep 26 • 37
Article When Does Reasoning Matter? Unpacking the Contribution of Reasoning to LLM Performance By Nicolas-BZRD and 1 other • 30 days ago • 12
Should We Still Pretrain Encoders with Masked Language Modeling? Paper • 2507.00994 • Published Jul 1 • 78
Article Should We Still Pretrain Encoders with Masked Language Modeling? By Nicolas-BZRD and 3 others • Jul 2 • 21
EuroBERT: Scaling Multilingual Encoders for European Languages Paper • 2503.05500 • Published Mar 7 • 79 • 9
Article Introducing EuroBERT: A High-Performance Multilingual Encoder Model By EuroBERT and 3 others • Mar 10 • 146