Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning • Paper 2506.05447 • Published Jun 5