MolPILE -- large-scale, diverse dataset for molecular representation learning Paper • 2509.18353 • Published Sep 22 • 1
ChemPile: A 250GB Diverse and Curated Dataset for Chemical Foundation Models Paper • 2505.12534 • Published May 18 • 3
SFT Doesn't Always Hurt General Capabilities: Revisiting Domain-Specific Fine-Tuning in LLMs Paper • 2509.20758 • Published Sep 25 • 1
Executable Knowledge Graphs for Replicating AI Research Paper • 2510.17795 • Published 11 days ago • 12
EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning Paper • 2510.17928 • Published 12 days ago • 2
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts Paper • 2510.19363 • Published 10 days ago • 59
Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs Paper • 2510.11288 • Published 19 days ago • 45
ChemDFM-R: An Chemical Reasoner LLM Enhanced with Atomized Chemical Knowledge Paper • 2507.21990 • Published Jul 29 • 26
Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report Paper • 2508.01059 • Published Aug 1 • 33
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7 • 177
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 188
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization Paper • 2508.14460 • Published Aug 20 • 82
KALE-LM: Unleash The Power Of AI For Science Via Knowledge And Logic Enhanced Large Model Paper • 2409.18695 • Published Sep 27, 2024 • 3