MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 2025 • 166
Language Models Can Learn from Verbal Feedback Without Scalar Rewards Paper • 2509.22638 • Published Sep 2025 • 67
Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning Paper • 2502.11962 • Published Feb 17 • 38
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis Paper • 2506.02096 • Published Jun 2 • 52
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 184
Sherlock: Self-Correcting Reasoning in Vision-Language Models Paper • 2505.22651 • Published May 28 • 50
Reinforcement Fine-Tuning Powers Reasoning Capability of Multimodal Large Language Models Paper • 2505.18536 • Published May 24 • 18
Optimizing Anytime Reasoning via Budget Relative Policy Optimization Paper • 2505.13438 • Published May 19 • 36
REFINE-AF: A Task-Agnostic Framework to Align Language Models via Self-Generated Instructions using Reinforcement Learning from Automated Feedback Paper • 2505.06548 • Published May 10 • 30
🚀 Active PRM: Efficient Process Reward Model Training via Active Learning Collection • 4 items • Updated Apr 16 • 3
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation Paper • 2504.13055 • Published Apr 17 • 19
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published Mar 26 • 56