HKCanto-Eval: A Benchmark for Evaluating Cantonese Language Understanding and Cultural Comprehension in LLMs Paper • 2503.12440 • Published Mar 16
Self-Correction Bench: Revealing and Addressing the Self-Correction Blind Spot in LLMs Paper • 2507.02778 • Published Jul 3