BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs Paper • 2504.18415 • Published Apr 25, 2025 • 47
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data Paper • 2502.08468 • Published Feb 12, 2025 • 15
Multimodal Latent Language Modeling with Next-Token Diffusion Paper • 2412.08635 • Published Dec 11, 2024 • 48
Data Selection via Optimal Control for Language Models Paper • 2410.07064 • Published Oct 9, 2024 • 9
Self-Boosting Large Language Models with Synthetic Preference Data Paper • 2410.06961 • Published Oct 9, 2024 • 16
Autoregressive Speech Synthesis without Vector Quantization Paper • 2407.08551 • Published Jul 11, 2024 • 17
Direct Preference Knowledge Distillation for Large Language Models Paper • 2406.19774 • Published Jun 28, 2024 • 22
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 95
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers Paper • 2406.05370 • Published Jun 8, 2024 • 19
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published May 20, 2024 • 50
MathScale: Scaling Instruction Tuning for Mathematical Reasoning Paper • 2403.02884 • Published Mar 5, 2024 • 17