ZhuofengLi PRO

https://github.com/Zhuofeng-Li

AI & ML interests

Agents, Reasoning LLMs/VLLMs, RL

ZhuofengLi's models: 13

ZhuofengLi/torl-qwen2.5-7b-instruct

8B • Updated Sep 11 • 2

ZhuofengLi/octo-science-qwen2.5-7b-grpo-step-40-v2

2B • Updated Aug 3 • 6

ZhuofengLi/octo-search-qwen2.5-7b-grpo-155-step-v1

8B • Updated Jul 29 • 5

ZhuofengLi/octo-search-qwen2.5-7b-grpo-step-60-v1.5

2B • Updated Jul 28 • 6

ZhuofengLi/tool-n1-multi-turn-reason-lora-sft-1180-step

Text Generation • 8B • Updated Jul 14 • 5

ZhuofengLi/xlam-reason-lora-sft-1340-step

Text Generation • 3B • Updated Jul 13 • 6

ZhuofengLi/tool-n1-reason-lora-sft-800-step

Text Generation • 8B • Updated Jul 4 • 7

ZhuofengLi/pot-r1-grpo-qwen2.5-7b-Instruct

Text Generation • 8B • Updated Mar 30 • 3

ZhuofengLi/pot-r1-grpo-qwen2.5-1.5b-Instruct

Text Generation • 2B • Updated Mar 30

ZhuofengLi/pot-r1-grpo-qwen2.5-1.5b-Instruct-wo-warmup

Text Generation • 2B • Updated Mar 28

ZhuofengLi/Qwen2.5-1.5B-Open-R1-GRPO

ZhuofengLi/pot-r1-grpo-qwen2.5-7b-Instruct-wo-warmup

Text Generation • 8B • Updated Mar 25

ZhuofengLi/SciBART-original

Updated Jul 4, 2024