Trace Anything: Representing Any Video in 4D via Trajectory Fields Paper • 2510.13802 • Published 11 days ago • 30
Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis Paper • 2509.09595 • Published Sep 11 • 48
InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis Paper • 2509.10441 • Published Sep 12 • 30
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published Sep 10 • 126
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation Paper • 2509.00428 • Published Aug 30 • 17
Few-step Flow for 3D Generation via Marginal-Data Transport Distillation Paper • 2509.04406 • Published Sep 4 • 11
Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual Try-On and Try-Off Paper • 2508.04825 • Published Aug 6 • 57