OLMo 2 • Collection • Artifacts for the OLMo 2 release. • 35 items
MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges? • Paper • 2504.09702 • Published Apr 13
CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives • Paper • 2504.10823 • Published Apr 15