Hi, I’d like to register our benchmark dataset internlm/WildClawBench as an official benchmark on the Hub. We have added eval.yaml to the repo with evaluation_framework: wildclawbench. Could you please add it to the benchmark allow-list? Thanks!
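For reference, here’s a minimal sketch of the eval.yaml we added. Only the `evaluation_framework` field is definitive; the other fields are illustrative placeholders, not a confirmed schema:

```yaml
# eval.yaml — evaluation_framework is the field named in this thread;
# the remaining fields are illustrative placeholders.
evaluation_framework: wildclawbench
task: question-answering   # hypothetical
metric: exact_match        # hypothetical
script: evaluate.py        # hypothetical entry point
```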
Hi,
Thanks for reaching out! Your dataset internlm/WildClawBench and the included eval.yaml look good. We can add it to the official benchmark allow-list.
Before we do, please ensure that:
- The repository follows the Hub’s benchmark submission guidelines.
- The eval.yaml includes all required fields and a working evaluation script (a minimal sketch of such a script follows this list).
- Any dependencies or instructions for reproducing the benchmark are clearly documented.
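For illustration, a minimal self-contained evaluation script could look like the sketch below. This assumes Python with the `datasets` library, exact-match scoring, and `question`/`answer` column names, none of which are confirmed details of WildClawBench:

```python
# Minimal evaluation-script sketch. Assumptions: the benchmark is scored by
# exact match, and the dataset exposes "question" and "answer" columns
# (placeholder names, not the confirmed WildClawBench schema).
from datasets import load_dataset


def evaluate(predict_fn, split: str = "test") -> float:
    """Return exact-match accuracy of predict_fn over the given split."""
    ds = load_dataset("internlm/WildClawBench", split=split)
    correct = sum(
        int(predict_fn(ex["question"]).strip() == ex["answer"].strip())
        for ex in ds
    )
    return correct / len(ds)


if __name__ == "__main__":
    # Trivial baseline: always predict the empty string.
    print(f"exact-match accuracy: {evaluate(lambda q: ''):.3f}")
```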
Once confirmed, we’ll proceed with adding it to the allow-list and it should appear as an official benchmark on the Hub.
Thanks for contributing this!
Hi,
Thanks for the update! I’ve double-checked the repo, eval.yaml, and documentation, and they’re all fully aligned with the Hub’s guidelines.
The benchmark is ready for the allow-list. Looking forward to seeing it live!