Le Yu

vanillaOVO

https://yule-buaa.github.io/

yule-BUAA

AI & ML interests

None yet

Recent Activity

upvoted a paper 22 days ago

Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window

upvoted a paper 3 months ago

Agentic Reinforced Policy Optimization

upvoted a paper 3 months ago

Group Sequence Policy Optimization

Organizations

None yet

upvoted a paper 22 days ago

Beyond Turn Limits: Training Deep Search Agents with Dynamic Context Window

Paper • 2510.08276 • Published 23 days ago • 9

upvoted 2 papers 3 months ago

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published Jul 26 • 156

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 306

authored a paper 3 months ago

RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback

Paper • 2507.15024 • Published Jul 20 • 14

liked a model 3 months ago

Qwen/Qwen3-Coder-480B-A35B-Instruct

Text Generation • 480B • Updated Aug 21 • 39.8k • 1.23k

upvoted a paper 3 months ago

RefCritic: Training Long Chain-of-Thought Critic Models with Refinement Feedback

Paper • 2507.15024 • Published Jul 20 • 14

upvoted a collection 4 months ago

Qwen3

Collection

84 items • Updated Aug 6 • 1.38k

upvoted 2 papers 5 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 420

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262

authored 2 papers 5 months ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 308

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 185

upvoted a paper 5 months ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 185

liked 2 models 5 months ago

deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

Text Generation • 8B • Updated May 29 • 112k • 971

deepseek-ai/DeepSeek-R1-0528

Text Generation • 685B • Updated May 29 • 542k • 2.38k

upvoted a paper 6 months ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14 • 308

authored 5 papers 6 months ago
