Agent Lightning: Train ANY AI Agents with Reinforcement Learning Paper • 2508.03680 • Published Aug 5 • 72
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 236
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models Paper • 2507.13344 • Published Jul 17 • 56
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17 • 257
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published Jul 19 • 131
Beyond Context Limits: Subconscious Threads for Long-Horizon Reasoning Paper • 2507.16784 • Published Jul 22 • 120
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels Paper • 2507.21809 • Published Jul 29 • 131
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents Paper • 2507.22827 • Published Jul 30 • 98
Phi-Ground Tech Report: Advancing Perception in GUI Grounding Paper • 2507.23779 • Published Jul 31 • 44