Update README.md

README.md CHANGED
@@ -9,5 +9,50 @@ tags:
- dpo
---

# Model Details

# Model Description

This model is based on teknium/OpenHermes-2.5-Mistral-7B, DPO fine-tuned with the H4rmony_dpo dataset.
Its completions should be more ecologically aware than the base model.

- Developed by: Jorge Vallego
- Funded by: Neovalle Ltd.
- Shared by: [email protected]
- Model type: mistral
- Language(s) (NLP): Primarily English
- License: MIT
- Finetuned from model: teknium/OpenHermes-2.5-Mistral-7B
- Methodology: DPO

# Uses

Intended as a PoC to show the effects of the H4rmony_dpo dataset with DPO fine-tuning.

# Direct Use

For testing purposes, to gain insight that helps with the continuous improvement of the H4rmony_dpo dataset.

# Downstream Use

Its direct use in applications is not recommended, as this model is under testing for a specific task only (ecological alignment).

# Out-of-Scope Use

Not meant to be used for anything other than testing and evaluation of the H4rmony_dpo dataset and ecological alignment.

# Bias, Risks, and Limitations

This model might reproduce biases already present in the base model, as well as others unintentionally introduced during fine-tuning.

# How to Get Started with the Model

It can be loaded and run in a Colab instance with High RAM.
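
The card does not ship a loading script, so the snippet below is only a minimal sketch using the Hugging Face transformers library. The repository id is a placeholder (the model's actual repo id is not stated here), and half precision with device_map="auto" is just one way to fit the model into a high-RAM Colab runtime; the prompt and generation settings are illustrative, not values recommended by the authors.

```python
# Minimal sketch: load the DPO fine-tuned model and generate one completion.
# NOTE: "neovalle/<model-repo-id>" is a placeholder; the card does not state the repo id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "neovalle/<model-repo-id>"  # placeholder, replace with the actual repository

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit a Colab GPU / high-RAM runtime
    device_map="auto",          # requires the accelerate package
)

messages = [{"role": "user", "content": "How can I make my garden more water-efficient?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```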

# Training Details

Trained using DPO.

# Training Data

H4rmony Dataset - https://huggingface.co/datasets/neovalle/H4rmony_dpo
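
The training script itself is not part of this card. As an illustration only, a DPO run over H4rmony_dpo could look roughly like the sketch below using the trl library; the hyperparameters, the split name, the assumed prompt/chosen/rejected columns, and the exact DPOConfig/DPOTrainer signature (which changes between trl versions) are all assumptions, not the authors' actual setup.

```python
# Illustrative sketch only: DPO fine-tuning of the base model on H4rmony_dpo with trl.
# Hyperparameters, the split name, and column assumptions are placeholders, not the
# authors' configuration; the DPOTrainer signature also varies across trl versions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "teknium/OpenHermes-2.5-Mistral-7B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Assumed to follow the usual DPO layout: prompt / chosen / rejected columns.
dataset = load_dataset("neovalle/H4rmony_dpo", split="train")

config = DPOConfig(
    output_dir="openhermes-h4rmony-dpo",
    beta=0.1,                        # weight of the KL penalty against the reference model
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=5e-6,
)

trainer = DPOTrainer(
    model=model,                     # a frozen reference copy is created automatically
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer,             # recent trl versions take processing_class= instead
)
trainer.train()
```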