SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models Paper • 2510.09541 • Published 14 days ago • 14
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Paper • 2502.09604 • Published Feb 13 • 35