Bowen

PeterJinGo

upvoted a paper 5 months ago

MIRIX: Multi-Agent Memory System for LLM-Based Agents

Paper • 2507.07957 • Published Jul 10 • 79

upvoted a collection 6 months ago

Search-R1-v0.3

RL with outcome reward + format reward. https://arxiv.org/abs/2505.15117 • 12 items • Updated Aug 12 • 2

upvoted a paper 7 months ago

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5 • 80

upvoted 2 collections 8 months ago

Search-R1-v0.2

Exploration with a more stable RL pipeline with outcome-only reward and scaled-up LLMs. https://arxiv.org/abs/2503.09516 • 26 items • Updated Aug 12 • 4

Search-R1

Preliminary checkpoints with outcome-only RL. • 15 items • Updated Aug 12 • 12

upvoted a paper 9 months ago

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36