-
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
Paper • 2508.18966 • Published • 56 -
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper • 2509.08721 • Published • 673 -
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper • 2412.13303 • Published • 70 -
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
Paper • 2411.15466 • Published • 39
Collections
Discover the best community collections!
Collections including paper arxiv:2508.18966
-
Seed-Coder: Let the Code Model Curate Data for Itself
Paper • 2506.03524 • Published • 6 -
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning
Paper • 2504.13914 • Published • 4 -
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Paper • 2503.10772 • Published • 19 -
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
Paper • 2503.09949 • Published • 5
-
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
Paper • 2401.09048 • Published • 10 -
Improving fine-grained understanding in image-text pre-training
Paper • 2401.09865 • Published • 18 -
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Paper • 2401.10891 • Published • 62 -
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Paper • 2401.13627 • Published • 77
-
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Paper • 2508.20751 • Published • 89 -
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling
Paper • 2508.17445 • Published • 80 -
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper • 2508.19247 • Published • 41 -
VibeVoice Technical Report
Paper • 2508.19205 • Published • 123
-
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation
Paper • 2105.09501 • Published -
Cross-modal Contrastive Learning for Speech Translation
Paper • 2205.02444 • Published -
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Paper • 2210.03052 • Published -
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning
Paper • 2212.10240 • Published • 1
-
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
Paper • 2508.18966 • Published • 56 -
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing
Paper • 2509.08721 • Published • 673 -
FastVLM: Efficient Vision Encoding for Vision Language Models
Paper • 2412.13303 • Published • 70 -
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
Paper • 2411.15466 • Published • 39
-
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
Paper • 2508.20751 • Published • 89 -
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling
Paper • 2508.17445 • Published • 80 -
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper • 2508.19247 • Published • 41 -
VibeVoice Technical Report
Paper • 2508.19205 • Published • 123
-
Seed-Coder: Let the Code Model Curate Data for Itself
Paper • 2506.03524 • Published • 6 -
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning
Paper • 2504.13914 • Published • 4 -
FlowTok: Flowing Seamlessly Across Text and Image Tokens
Paper • 2503.10772 • Published • 19 -
UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?
Paper • 2503.09949 • Published • 5
-
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation
Paper • 2105.09501 • Published -
Cross-modal Contrastive Learning for Speech Translation
Paper • 2205.02444 • Published -
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Paper • 2210.03052 • Published -
Diffusion Glancing Transformer for Parallel Sequence to Sequence Learning
Paper • 2212.10240 • Published • 1
-
Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
Paper • 2401.09048 • Published • 10 -
Improving fine-grained understanding in image-text pre-training
Paper • 2401.09865 • Published • 18 -
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
Paper • 2401.10891 • Published • 62 -
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
Paper • 2401.13627 • Published • 77