Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free Paper • 2505.06708 • Published May 10 • 4
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 164
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published Oct 13 • 157
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs Paper • 2507.07996 • Published Jul 10 • 34
L1 Collection • L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning • 7 items • Updated Jul 13 • 8
TOUCAN: Synthesizing 1.5M Tool-Agentic Data from Real-World MCP Environments Paper • 2510.01179 • Published Oct 1 • 24
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model Paper • 2508.14444 • Published Aug 20 • 36
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling Paper • 2508.17445 • Published Aug 24 • 80
NVIDIA Nemotron Collection • Open, production-ready enterprise models under the NVIDIA Open Model License. • 5 items • Updated Oct 21 • 63
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 150
Recurrent Models Collection • These are checkpoints for recurrent LLMs developed to scale test-time compute by recurring in latent space. • 15 items • Updated May 21 • 10
GLM-4.5 Collection • GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated Aug 11 • 245
Not All Correct Answers Are Equal: Why Your Distillation Source Matters Paper • 2505.14464 • Published May 20 • 9