arxiv:2507.16331
Binhang Yuan
biyuan
AI & ML interests
ML System
Recent Activity
authored a paper 3 months ago
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
authored a paper 3 months ago
Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning