MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency • Paper • 2510.25897 • Published 4 days ago • 13
EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis • Paper • 2510.25628 • Published 4 days ago • 9
OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes • Paper • 2510.26800 • Published 3 days ago • 17
Uniform Discrete Diffusion with Metric Path for Video Generation • Paper • 2510.24717 • Published 5 days ago • 39
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation • Paper • 2510.23581 • Published 6 days ago • 41
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations • Paper • 2510.23607 • Published 6 days ago • 167
RAPO++: Cross-Stage Prompt Optimization for Text-to-Video Generation via Data Alignment and Test-Time Scaling • Paper • 2510.20206 • Published 11 days ago • 11
Video-As-Prompt: Unified Semantic Control for Video Generation • Paper • 2510.20888 • Published 10 days ago • 41
LayerComposer: Interactive Personalized T2I via Spatially-Aware Layered Canvas • Paper • 2510.20820 • Published 10 days ago • 8
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives • Paper • 2510.20822 • Published 10 days ago • 38
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence • Paper • 2510.20579 • Published 10 days ago • 52
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing • Paper • 2510.19808 • Published 11 days ago • 27
GigaBrain-0: A World Model-Powered Vision-Language-Action Model • Paper • 2510.19430 • Published 11 days ago • 43
LightMem: Lightweight and Efficient Memory-Augmented Generation • Paper • 2510.18866 • Published 12 days ago • 106
UltraGen: High-Resolution Video Generation with Hierarchical Attention • Paper • 2510.18775 • Published 12 days ago • 16