Update README.md
README.md
CHANGED
@@ -12,13 +12,15 @@ pipeline_tag: text-classification
 <hr>
 <div align="center" style="line-height: 1;">
 <a href="https://arxiv.org/abs/2507.01352" target="_blank">
-<img alt="Paper" src="https://img.shields.io/badge
+<img alt="Paper" src="https://img.shields.io/badge/📄_Paper-Skywork--Reward--V2-4D5EFF?style=flat-square&labelColor=202124"/>
 </a>
 <a href="https://huggingface.co/collections/Skywork/skywork-reward-v2-685cc86ce5d9c9e4be500c84" target="_blank">
-<img alt="Models" src="https://img.shields.io/badge/🤗_Hugging_Face-
+<img alt="Models" src="https://img.shields.io/badge/🤗_Hugging_Face-Model_Collection-4D5EFF?style=flat-square&labelColor=202124"/>
+</a>
+<a href="https://github.com/SkyworkAI/Skywork-Reward-V2" target="_blank">
+<img alt="GitHub" src="https://img.shields.io/badge/🧑‍💻_GitHub-Skywork--Reward--V2-4D5EFF?style=flat-square&labelColor=202124"/>
 </a>
 </div>
-
 ## 🔥 Highlights
 
 **Skywork-Reward-V2** is a series of eight reward models designed for versatility across a wide range of tasks, trained on a mixture of 26 million carefully curated preference pairs. While the Skywork-Reward-V2 series remains based on the Bradley-Terry model, we push the boundaries of training data scale and quality to achieve superior performance. Compared to the first generation of Skywork-Reward, the Skywork-Reward-V2 series offers the following major improvements: