## Model Description

This is the SFT model in our Mixture of Agents Alignment (MoAA) pipeline, tuned from Llama-3.1-8B-Instruct. MoAA is an approach that leverages the collective intelligence of open-source LLMs to advance alignment.
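
The snippet below is a minimal usage sketch with the `transformers` library; the repo id is a placeholder for wherever this checkpoint is hosted, and the generation settings are illustrative rather than prescribed by this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: substitute the actual location of this checkpoint.
model_id = "your-org/moaa-sft-llama-3.1-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The base model is Llama-3.1-8B-Instruct, so its chat template applies.
messages = [{"role": "user", "content": "Explain model alignment in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```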

Our MoAA method involves two main stages. In the first stage, we employ MoA to produce high-quality synthetic data for supervised fine-tuning. In the second stage, we combine multiple LLMs into a reward model that provides preference annotations.
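
To make the first stage concrete, here is a minimal sketch of MoA-style data generation: several proposer models answer the same instruction, and an aggregator model synthesizes their answers into the SFT target. The `query` helper, model names, and aggregation prompt are illustrative stand-ins, not the exact setup from the paper.

```python
# Illustrative sketch of stage 1 (MoA synthetic data generation).

PROPOSERS = ["model-a", "model-b", "model-c"]  # several open-source LLMs
AGGREGATOR = "model-d"                         # a strong open-source LLM

def query(model: str, prompt: str) -> str:
    """Stand-in for a real inference call (e.g., a local server or an API)."""
    return f"[{model}'s answer to: {prompt[:40]}...]"

def moa_generate(instruction: str) -> str:
    # Each proposer answers the instruction independently.
    proposals = [query(m, instruction) for m in PROPOSERS]
    # The aggregator synthesizes the proposals into a single response,
    # which becomes the SFT target for this instruction.
    numbered = "\n\n".join(f"Response {i + 1}:\n{p}" for i, p in enumerate(proposals))
    prompt = (
        "Synthesize the candidate responses below into a single, high-quality "
        f"answer to the instruction.\n\nInstruction: {instruction}\n\n{numbered}"
    )
    return query(AGGREGATOR, prompt)

print(moa_generate("Summarize the benefits of unit testing."))
```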

Refer to the [Paper](https://arxiv.org/abs/2505.03059) for metrics.

## Citation

```
@article{wang2025improving,
  title = {Improving Model Alignment Through Collective Intelligence of Open-Source LLMs},