Yulai Zhao

sarosavo

http://yulaizhao.com

Recent Activity

updated a dataset about 1 month ago

sarosavo/RLEV

Viewer • Updated Oct 27 • 215k • 125

authored a paper about 2 months ago

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published Oct 23 • 18

upvoted a paper about 2 months ago

Every Question Has Its Own Value: Reinforcement Learning with Explicit Human Values

Paper • 2510.20187 • Published Oct 23 • 18

published a dataset about 2 months ago

sarosavo/RLEV

Viewer • Updated Oct 27 • 215k • 125

upvoted 2 papers 3 months ago

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 75

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31 • 84

upvoted a paper 5 months ago

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31

authored 3 papers 5 months ago

Provably Efficient CVaR RL in Low-rank MDPs

Paper • 2311.11965 • Published Nov 20, 2023

Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding

Paper • 2408.08252 • Published Aug 15, 2024 • 1

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31

updated a dataset 5 months ago

sarosavo/Master-RM

Viewer • Updated Jul 15 • 180k • 90 • 10

New activity in sarosavo/Master-RM 5 months ago

Add library_name to metadata

#3 opened 5 months ago by

updated a model 5 months ago

sarosavo/Master-RM

Text Classification • 8B • Updated Jul 15 • 92 • 16

New activity in sarosavo/Master-RM 5 months ago

Add pipeline tag and GitHub link to model card

#2 opened 5 months ago by

New activity in sarosavo/Master-RM 5 months ago

Improve dataset card: Update task category, add description and relevant tags

#2 opened 5 months ago by

New activity in sarosavo/Master-RM 5 months ago

Improve model card with metadata, links, and usage example

#1 opened 5 months ago by

published a dataset 5 months ago

sarosavo/Master-RM

Viewer • Updated Jul 15 • 180k • 90 • 10

published a model 5 months ago

sarosavo/Master-RM

Text Classification • 8B • Updated Jul 15 • 92 • 16

authored a paper almost 2 years ago

Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning

Paper • 2305.04819 • Published May 8, 2023