- Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free (Paper • arXiv:2505.06708 • Published May 10, 2025)
- QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs (Paper • arXiv:2510.11696 • Published Oct 2025)
- Diffusion Transformers with Representation Autoencoders (Paper • arXiv:2510.11690 • Published Oct 2025)
- Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs (Paper • arXiv:2507.07996 • Published Jul 10, 2025)
- L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning (Collection • 7 items • Updated Jul 13)
- TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments (Paper • arXiv:2510.01179 • Published Oct 2025)