Alchemist: Unlocking Efficiency in Text-to-Image Model Training via Meta-Gradient Data Selection Paper • 2512.16905 • Published 7 days ago • 30
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28 • 71
JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence Paper • 2510.23538 • Published Oct 27 • 96
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27 • 176
PhysMaster: Mastering Physical Representation for Video Generation via Reinforcement Learning Paper • 2510.13809 • Published Oct 15 • 37
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Paper • 2503.10639 • Published Mar 13 • 53