Guoheng Sun

s1ghhh

AI & ML interests

None yet

Recent Activity

liked a model 4 days ago

deepseek-ai/DeepSeek-OCR

liked a Space 7 days ago

facebook/vggt

liked a model 7 days ago

facebook/VGGT-1B-Commercial

upvoted a paper 5 months ago

CoIn: Counting the Invisible Reasoning Tokens in Commercial Opaque LLM APIs

Paper • 2505.13778 • Published May 19 • 5

upvoted an article 8 months ago

From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate

Article • Jun 13, 2024 • 60

upvoted an article 9 months ago

Open R1: Update #2

Article • By … and 6 others • Feb 10 • 218

upvoted a collection 12 months ago

LLM-Drop

Model weights of paper "What Matters in Transformers? Not All Attention is Needed" (https://arxiv.org/abs/2406.15786) • 14 items • Updated Oct 23, 2024 • 4

upvoted a paper about 1 year ago

What Matters in Transformers? Not All Attention is Needed

Paper • 2406.15786 • Published Jun 22, 2024 • 31

upvoted a collection about 1 year ago

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 236