PixelRefer: A Unified Framework for Spatio-Temporal Object Referring with Arbitrary Granularity Paper • 2510.23603 • Published 23 days ago • 22
OlmoEarth: Stable Latent Image Modeling for Multimodal Earth Observation Paper • 2511.13655 • Published 2 days ago • 9
Back to Basics: Let Denoising Generative Models Denoise Paper • 2511.13720 • Published 2 days ago • 18
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published 6 days ago • 67
Holo2 Collection Holo2 - Cost-Efficient Models for Cross-Platform Computer-Use Agents • 3 items • Updated 6 days ago • 21
ViDoRe Benchmark V3 Collection ViDoRe V3 is our latest benchmark, engineered to set a new industry gold standard for multi-modal, enterprise document retrieval evaluation. • 8 items • Updated 14 days ago • 11
Article ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases 14 days ago • 48
Jan-v2-VL Collection Jan-v2-VL: an 8B VLM focused on reliable, many-step task execution. • 6 items • Updated 6 days ago • 27
Hugging Face community’s Wikimedia datasets Collection Wikimedia datasets created by the Hugging Face community, not Wikimedia. Sorted by Wikimedia project. • 17 items • Updated Jun 7, 2024 • 12
Socratic-Zero : Bootstrapping Reasoning via Data-Free Agent Co-evolution Paper • 2509.24726 • Published Sep 29 • 19
Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets Paper • 2204.05235 • Published Apr 11, 2022 • 1
Fantastic (small) Retrievers and How to Train Them: mxbai-edge-colbert-v0 Tech Report Paper • 2510.14880 • Published Oct 16 • 15
Clara-Medical Collection NVIDIA Clara Open Models for medical imaging AI: segment, generate, and reason across CT, MRI, and X-ray. Built on MONAI by NVIDIA. • 7 items • Updated 1 day ago • 5
NVIDIA Nemotron V2 Collection Open, Production-ready Enterprise Models. NVIDIA Open Model License. • 9 items • Updated 1 day ago • 82