FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 73
ComposeAnything: Composite Object Priors for Text-to-Image Generation Paper • 2505.24086 • Published May 30 • 5
ComposeAnything: Composite Object Priors for Text-to-Image Generation Paper • 2505.24086 • Published May 30 • 5 • 3
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation Paper • 2504.08736 • Published Apr 11 • 46
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7 • 52