Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning Paper • 2510.04081 • Published Oct 5 • 23
When Thinking Backfires: Mechanistic Insights Into Reasoning-Induced Misalignment Paper • 2509.00544 • Published Aug 30 • 11
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12, 2024 • 145
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published Dec 30, 2024 • 40
🔍 Interpretability & Analysis of LMs Collection Outstanding research in LM interpretability and evaluation, summarized • 134 items • Updated Oct 20 • 116
Article Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 Jun 20, 2024 • 26