Hydra: A 1.6B-Parameter State-Space Language Model with Sparse Attention, Mixture-of-Experts, and Memory
Paper • 2508.15099 • Published Aug 20