Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 9 days ago • 69
mmBERT: a modern multilingual encoder Collection mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9 • 49
Contextual AI Reranker v2 Collection Family of instruction-following multilingual rerankers on the cost/performance Pareto frontier across public and customer benchmarks • 6 items • Updated Aug 25 • 9
Pangea Collection A Fully Open Multilingual Multimodal LLM for 39 Languages • 26 items • Updated Feb 1 • 19
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated Jul 10 • 87
VideoLLaMA2 Collection Optimized VideoLLaMA with improved spatial-temporal modeling and enhanced audio understanding capabilities • 13 items • Updated Sep 2 • 19
Extracting Concepts from LLMs: Anthropic’s recent discoveries 📖 Article • Published Jun 20, 2024 • 26
Cohere Labs Aya 23 Collection Aya 23 is an open-weights research release of an instruction-fine-tuned model with highly advanced multilingual capabilities. • 3 items • Updated Jul 31 • 56
Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 21, 2024 • 33
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models Paper • 2404.07973 • Published Apr 11, 2024 • 32
StarChat2 15B Collection Models, datasets, and demo for StarChat2 15B. For the code to train the models, see: https://github.com/huggingface/alignment-handbook • 10 items • Updated Apr 12, 2024 • 13
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15, 2024 • 59