- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 3
- Efficient Monotonic Multihead Attention
  Paper • 2312.04515 • Published • 8
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 39
- Exploring Format Consistency for Instruction Tuning
  Paper • 2307.15504 • Published • 8

Collections
Collections including paper arxiv:2312.17243
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 60
- PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
  Paper • 2312.12456 • Published • 44
- Cached Transformers: Improving Transformers with Differentiable Memory Cache
  Paper • 2312.12742 • Published • 14
- Mini-GPTs: Efficient Large Language Models through Contextual Pruning
  Paper • 2312.12682 • Published • 10

- Learning Vision from Models Rivals Learning Vision from Data
  Paper • 2312.17742 • Published • 16
- Unsupervised Universal Image Segmentation
  Paper • 2312.17243 • Published • 20
- Perspectives on the State and Future of Deep Learning - 2023
  Paper • 2312.09323 • Published • 8
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 14

- A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
  Paper • 2310.16656 • Published • 50
- Unsupervised Universal Image Segmentation
  Paper • 2312.17243 • Published • 20
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
  Paper • 2402.03620 • Published • 117
- Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks
  Paper • 2402.04248 • Published • 32

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 25
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 111
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 74
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33