Alex Makelov

amakelov
https://amakelov.github.io
  • AMakelov
  • amakelov

AI & ML interests

Interpretability

Organizations

None yet

authored a paper 10 months ago

Towards Deep Learning Models Resistant to Adversarial Attacks

Paper • 1706.06083 • Published Jun 19, 2017
authored 2 papers about 1 year ago

Is This the Subspace You Are Looking for? An Interpretability Illusion for Subspace Activation Patching

Paper • 2311.17030 • Published Nov 28, 2023

Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control

Paper • 2405.08366 • Published May 14, 2024