What Layers When: Learning to Skip Compute in LLMs with Residual Gates • Paper • 2510.13876 • Published Oct 13 • 11 upvotes
Hybrid Linear Attention Research Collection • All 1.3B & 340M hybrid linear-attention experiments • 62 items • Updated Sep 11 • 12 upvotes
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance • Article • May 21 • 38 upvotes
Common Models Collection • The first generation of models pretrained on Common Corpus • 5 items • Updated Dec 5, 2024 • 41 upvotes
Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients • Paper • 2406.17660 • Published Jun 25, 2024 • 5 upvotes
Efficient Continual Pre-training by Mitigating the Stability Gap • Paper • 2406.14833 • Published Jun 21, 2024 • 20 upvotes
SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity • Paper • 2401.17072 • Published Jan 30, 2024 • 25 upvotes
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs • Paper • 2309.09582 • Published Sep 18, 2023 • 4 upvotes