Update README.md
README.md
CHANGED
@@ -21,6 +21,8 @@ tags:
 
 <!-- Provide a quick summary of what the model is/does. -->
 
+**If you mention this dataset in a paper, please cite the paper:** [Understanding Dataset Difficulty with V-Usable Information (ICML 2022)](https://proceedings.mlr.press/v162/ethayarajh22a.html).
+
 SteamSHP-XL is a preference model trained to predict -- given some context and two possible responses -- which response humans will find more helpful.
 It can be used for NLG evaluation or as a reward model for RLHF.
 