YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection (arXiv:2512.23273)
A 58-Addition, Rank-23 Scheme for General 3x3 Matrix Multiplication (arXiv:2512.21980)
LoPA: Scaling dLLM Inference via Lookahead Parallel Decoding (arXiv:2512.16229)
CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion (arXiv:2512.19535)
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers (arXiv:2512.17351)
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing (arXiv:2512.14681)
Janus: Disaggregating Attention and Experts for Scalable MoE Inference (arXiv:2512.13525)
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management (arXiv:2512.12967)
Error-Free Linear Attention is a Free Lunch: Exact Solution from Continuous-Time Dynamics (arXiv:2512.12602)