MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion Paper • 2502.04235 • Published Feb 6 • 23
Memory Retrieval and Consolidation in Large Language Models through Function Tokens Paper • 2510.08203 • Published Oct 9 • 9
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published Oct 8 • 30
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published Oct 2 • 95
Generative Universal Verifier as Multimodal Meta-Reasoner Paper • 2510.13804 • Published Oct 15 • 25
Beyond Correctness: Evaluating Subjective Writing Preferences Across Cultures Paper • 2510.14616 • Published Oct 16 • 11
Parallel Loop Transformer for Efficient Test-Time Computation Scaling Paper • 2510.24824 • Published Oct 28 • 16
MME-CC: A Challenging Multi-Modal Evaluation Benchmark of Cognitive Capacity Paper • 2511.03146 • Published Nov 5 • 7
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4 • 57
Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space Paper • 2510.04476 • Published Oct 6 • 16
DINOv3 Collection • DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 400
Deep Ignorance Collection • This collection contains the model and data artifacts from O'Brien et al. (2025). https://deepignorance.ai • 43 items • Updated about 1 month ago • 6
H-Net Collection • The family of hierarchical networks (H-Nets) from https://arxiv.org/abs/2507.07955 • 8 items • Updated Jul 11 • 20
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published May 5 • 36
Train 400x faster Static Embedding Models with Sentence Transformers Article • Published Jan 15 • 220