PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published 12 days ago • 61
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs Paper • 2509.22220 • Published Sep 26 • 64
SWE-QA: Can Language Models Answer Repository-level Code Questions? Paper • 2509.14635 • Published Sep 18 • 36
RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation Paper • 2509.16198 • Published Sep 19 • 126
MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8 • 63
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28 • 50
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published May 24 • 64
PixelHacker: Image Inpainting with Structural and Semantic Consistency Paper • 2504.20438 • Published Apr 29 • 44
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13 • 193
Lifelong Sequential Knowledge Editing without Model Degradation Paper • 2502.01636 • Published Feb 3 • 5
Language Models Prefer What They Know: Relative Confidence Estimation via Confidence Preferences Paper • 2502.01126 • Published Feb 3 • 4
Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models Paper • 2501.19389 • Published Jan 31 • 4
Activation Approximations Can Incur Safety Vulnerabilities Even in Aligned LLMs: Comprehensive Analysis and Defense Paper • 2502.00840 • Published Feb 2 • 1
COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation Paper • 2502.02589 • Published Feb 4 • 10
Generating Multi-Image Synthetic Data for Text-to-Image Customization Paper • 2502.01720 • Published Feb 3 • 8
Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification Paper • 2502.01839 • Published Feb 3 • 11
Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial? Paper • 2502.00674 • Published Feb 2 • 13