Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.16577

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14 • 18
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19 • 48

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published 18 days ago • 70
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

Paper • 2510.15019 • Published 17 days ago • 62
GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction

Paper • 2509.18090 • Published Sep 22 • 3
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Paper • 2509.19296 • Published Sep 23 • 22

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models

Paper • 2506.19851 • Published Jun 24 • 60
SeqTex: Generate Mesh Textures in Video Sequence

Paper • 2507.04285 • Published Jul 6 • 9
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 85
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published Aug 14 • 31

TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

Paper • 2401.09416 • Published Jan 17, 2024 • 11
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Paper • 2401.10171 • Published Jan 18, 2024 • 14
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Paper • 2311.09217 • Published Nov 15, 2023 • 22
GALA: Generating Animatable Layered Assets from a Single Scan

Paper • 2401.12979 • Published Jan 23, 2024 • 9

Describe What You See with Multimodal Large Language Models to Enhance Video Recommendations

Paper • 2508.09789 • Published Aug 13 • 5
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents

Paper • 2508.13186 • Published Aug 14 • 18
ZARA: Zero-shot Motion Time-Series Analysis via Knowledge and Retrieval Driven LLM Agents

Paper • 2508.04038 • Published Aug 6 • 1
Prompt Orchestration Markup Language

Paper • 2508.13948 • Published Aug 19 • 48

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models

Paper • 2506.19851 • Published Jun 24 • 60
SeqTex: Generate Mesh Textures in Video Sequence

Paper • 2507.04285 • Published Jul 6 • 9
Yume: An Interactive World Generation Model

Paper • 2507.17744 • Published Jul 23 • 85
STream3R: Scalable Sequential 3D Reconstruction with Causal Transformer

Paper • 2508.10893 • Published Aug 14 • 31

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 34
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17, 2024 • 27
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17, 2024 • 22

TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

Paper • 2401.09416 • Published Jan 17, 2024 • 11
SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild

Paper • 2401.10171 • Published Jan 18, 2024 • 14
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Paper • 2311.09217 • Published Nov 15, 2023 • 22
GALA: Generating Animatable Layered Assets from a Single Scan

Paper • 2401.12979 • Published Jan 23, 2024 • 9

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published 18 days ago • 70
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

Paper • 2510.15019 • Published 17 days ago • 62
GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction

Paper • 2509.18090 • Published Sep 22 • 3
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Paper • 2509.19296 • Published Sep 23 • 22

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs