Article Activation Steering: A New Frontier in AI Control - But Does It Scale? By royswastik • Feb 2 • 3
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 238
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models Paper • 2505.00551 • Published May 1 • 36
LLMs for Engineering: Teaching Models to Design High Powered Rockets Paper • 2504.19394 • Published Apr 27 • 14
AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization Paper • 2504.21659 • Published Apr 30 • 14
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers Paper • 2504.20752 • Published Apr 29 • 92
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents Paper • 2501.11858 • Published Jan 21 • 7
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9 • 95
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published Jan 7 • 27
VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models Paper • 2412.01822 • Published Dec 2, 2024 • 15
DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving Paper • 2411.15139 • Published Nov 22, 2024 • 15
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129
Article LLaVA-o1: Let Vision Language Models Reason Step-by-Step By mikelabs • Nov 19, 2024 • 12