arxiv:2509.03059
Junxiao Yang
yangjunxiao2021
AI & ML interests
Alignment/AI safety
Recent Activity
upvoted a paper 4 days ago
It Takes Two: Your GRPO Is Secretly DPO
upvoted a collection 4 days ago
Agent & RL
upvoted a paper 4 days ago
Glyph: Scaling Context Windows via Visual-Text Compression