SRFT

SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning
Paper • 2506.19767 • Published Jun 24 • 15

Yuqian-Fu/SRFT-Qwen2.5-Math-7B
Text Generation • 8B • Updated Jul 24 • 14 • 3

Yuqian-Fu/SRFT-Qwen2.5-7B-Instruct
8B • Updated Jul 24 • 2

Yuqian-Fu/SRFT-Qwen2.5-Math-1.5B
2B • Updated Jul 24 • 2