Article Australian-made LLM beats OpenAI and Google at legal retrieval By isaacus and 2 others • 8 days ago • 25
Article There is no such thing as a tokenizer-free lunch By catherinearnett • Sep 25 • 84
The Majority is not always right: RL training for solution aggregation Paper • 2509.06870 • Published Sep 8 • 16
Reinforcement Learning Finetunes Small Subnetworks in Large Language Models Paper • 2505.11711 • Published May 16 • 11
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 135
Article Gotchas in Tokenizer Behavior Every Developer Should Know By qgallouedec • Apr 18 • 44
Gemma 3 Collection All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 55 items • Updated about 6 hours ago • 89
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19, 2024 • 78