On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models Paper • 2512.07783 • Published 1 day ago • 20
Quark Quantized PTPC FP8 Models Collection PTPC models quantized by Quark • 6 items • Updated 5 days ago
Instella ✨ Collection Announcing Instella, a series of 3-billion-parameter language models developed by AMD, trained from scratch on 128 Instinct MI300X GPUs. • 13 items • Updated 5 days ago • 10
Instella: Fully Open Language Models with Stellar Performance Paper • 2511.10628 • Published 26 days ago • 4 • 2