BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14 • 60
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition Paper • 2309.15223 • Published Sep 26, 2023 • 22
Chart-RVR Collection Models trained using GRPO for enhanced Chart Reasoning • 3 items • Updated Aug 24 • 1
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Paper • 2402.00159 • Published Jan 31, 2024 • 65
Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Feb 11 • 89
Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control Paper • 2504.17130 • Published Apr 23 • 1
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 250
LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset Paper • 2402.09391 • Published Feb 14, 2024 • 2