
Jiawei Liu

ganler

https://jw-liu.xyz/

AI & ML interests

Simplifying the making of great software.

Recent Activity

upvoted a paper about 2 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

upvoted an article 2 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

published a dataset 4 months ago

purpcode/ctxdistill-verified-ablation-Qwen2.5-14B-Instruct-1M-73k



upvoted a paper about 2 months ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9 • 36

upvoted an article 2 months ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Article • Feb 7 • 255

published 3 datasets 4 months ago

updated 2 datasets 4 months ago

purpcode/ctxdistill-verified-Qwen2.5-14B-Instruct-1M-57k

Viewer • Updated Aug 9 • 57.7k • 36

purpcode/ctxdistill-verified-Qwen2.5-32B-Instruct-55k

Viewer • Updated Aug 9 • 55.6k • 20

updated a Space 4 months ago

README

🦀

updated a collection 4 months ago

Paper

Collection

1 item • Updated Aug 5

updated a dataset 4 months ago

purpcode/ctxdistill-verified-ablation-Qwen2.5-14B-Instruct-1M-73k

Viewer • Updated Aug 5 • 74k • 14

updated a collection 4 months ago

PurpCode Models

Collection

4 items • Updated Aug 5

published a Space 4 months ago

README

🦀

published 2 models 4 months ago

purpcode/purpcode-14b-rule-sft

Text Generation • 15B • Updated Jul 31 • 19

purpcode/purpcode-32b-rule-sft

Text Generation • 33B • Updated Jul 31 • 9

updated 2 models 4 months ago

purpcode/purpcode-32b-rule-sft

Text Generation • 33B • Updated Jul 31 • 9

purpcode/purpcode-14b-rule-sft

Text Generation • 15B • Updated Jul 31 • 19

published a model 4 months ago

purpcode/purpcode-32b-rl

Text Generation • 33B • Updated Jul 31 • 49
