Dan Busbridge

dbusbridge
  • danbusbridge
  • dbusbridge

AI & ML interests

Deep learning, optimization, self-supervised learning, representation learning, large language modeling, equivariance, geometric deep learning, attention mechanisms, transformers

Organizations

Apple

authored 6 papers 10 months ago

Stabilizing Transformer Training by Preventing Attention Entropy Collapse

Paper • 2303.06296 • Published Mar 11, 2023 • 1

The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

Paper • 2307.10907 • Published Jul 20, 2023 • 8

Poly-View Contrastive Learning

Paper • 2403.05490 • Published Mar 8, 2024

Position Prediction as an Effective Pretraining Strategy

Paper • 2207.07611 • Published Jul 15, 2022 • 1

Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Paper • 2501.12370 • Published Jan 21, 2025 • 11
authored a paper about 1 year ago

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

Paper • 2409.04431 • Published Sep 6, 2024 • 2
authored a paper over 2 years ago

How to Scale Your EMA

Paper • 2307.13813 • Published Jul 25, 2023 • 9