Update README.md
README.md
CHANGED
|
@@ -101,7 +101,7 @@ base_model:
|
|
| 101 |
- FacebookAI/xlm-roberta-large
|
| 102 |
---
|
| 103 |
|
| 104 |
-
# COMET-poly-
|
| 105 |
|
| 106 |
This model is based on [COMET-poly](https://github.com/zouharvi/COMET-poly), a fork of Unbabel's original COMET that is not compatible with it.
|
| 107 |
To run the model, you need to first install this version of COMET either with:
|
|
@@ -115,38 +115,57 @@ cd COMET-poly
|
|
| 115 |
pip3 install -e comet_poly
|
| 116 |
```
|
| 117 |
|
| 118 |
-
This model scores the translation `mt` but takes additional in-context example:
|
| 119 |
```python
|
| 120 |
import comet_poly
|
| 121 |
-
model = comet_poly.load_from_checkpoint(comet_poly.download_model("zouharvi/COMET-poly-
|
| 122 |
data = [
|
| 123 |
{
|
| 124 |
"src": "Iceberg lettuce got its name in the 1920s when it was shipped packed in ice to stay fresh.",
|
| 125 |
"mt": "Eisbergsalat erhielt seinen Namen in den 1920er-Jahren, als er in Eis verpackt verschickt wurde, um frisch zu bleiben.",
|
| 126 |
"src2": "Lettuce is mostly water, which helps keep it crisp when chilled.",
|
| 127 |
"mt2": "Kopfsalat besteht größtenteils aus Wasser, was ihm hilft, beim Kühlen knackig zu bleiben.",
|
| 128 |
-
"score2": 94.5
|
| 129 |
},
|
| 130 |
{
|
| 131 |
"src": "Goats have rectangular pupils, which give them a wide field of vision—up to 320 degrees!",
|
| 132 |
"mt": "Kozy mají obdélníkové zornice, což jim umožňuje vidět skoro všude kolem sebe, aniž by musely otáčet hlavou.",
|
| 133 |
"src2": "Sheep, like goats, also have rectangular pupils for better peripheral vision.",
|
| 134 |
"mt2": "Вівці, як і кози, також мають прямокутні зіниці для кращого периферичного зору.",
|
| 135 |
-
"score2": 96.0
|
| 136 |
},
|
| 137 |
{
|
| 138 |
"src": "This helps them spot predators from almost all directions without moving their heads.",
|
| 139 |
"mt": "Điều này giúp chúng phát hiện kẻ săn mồi từ gần như mọi hướng mà không cần quay đầu.",
|
| 140 |
"src2": "Many prey animals have evolved to detect threats with minimal movement.",
|
| 141 |
"mt2": "Nhiều động vật thịt có tiến hóa để xem mối nguy bằng nhỏ đi lại.",
|
| 142 |
-
"score2": 42.3
|
| 143 |
}
|
| 144 |
]
|
| 145 |
print("scores", model.predict(data, batch_size=8, gpus=1).scores)
|
| 146 |
```
|
| 147 |
Outputs:
|
| 148 |
```
|
| 149 |
-
scores [98.
|
| 150 |
```
|
| 151 |
|
| 152 |
You can use readily available training data for on-the-fly retrieval.
|
|
@@ -173,7 +192,7 @@ data_kb = list(datasets.load_dataset("zouharvi/wmt-human-all", split="train"))
|
|
| 173 |
data_retrieved = comet_poly.retrieval.retrieve_from_kb(
|
| 174 |
data=data,
|
| 175 |
data_kb=data_kb,
|
| 176 |
-
k=
|
| 177 |
prevent_hardmatch=False,
|
| 178 |
key="src",
|
| 179 |
)
|
|
|
|
| 101 |
- FacebookAI/xlm-roberta-large
|
| 102 |
---
|
| 103 |
|
| 104 |
+
# COMET-poly-ic3-wmt25
|
| 105 |
|
| 106 |
This model is based on [COMET-poly](https://github.com/zouharvi/COMET-poly), a fork of Unbabel's original COMET that is not compatible with it.
|
| 107 |
To run the model, you need to first install this version of COMET either with:
|
|
|
|
| 115 |
pip3 install -e comet_poly
|
| 116 |
```
|
| 117 |
|
| 118 |
+
This model scores the translation `mt` but additionally takes three in-context examples: sources `src2`, `src3`, `src4`, translations `mt2`, `mt3`, `mt4`, and scores `score2`, `score3`, `score4`, which makes it a better quality estimator:
|
| 119 |
```python
|
| 120 |
import comet_poly
|
| 121 |
+
model = comet_poly.load_from_checkpoint(comet_poly.download_model("zouharvi/COMET-poly-ic3-wmt25"))
|
| 122 |
data = [
|
| 123 |
{
|
| 124 |
"src": "Iceberg lettuce got its name in the 1920s when it was shipped packed in ice to stay fresh.",
|
| 125 |
"mt": "Eisbergsalat erhielt seinen Namen in den 1920er-Jahren, als er in Eis verpackt verschickt wurde, um frisch zu bleiben.",
|
| 126 |
"src2": "Lettuce is mostly water, which helps keep it crisp when chilled.",
|
| 127 |
"mt2": "Kopfsalat besteht größtenteils aus Wasser, was ihm hilft, beim Kühlen knackig zu bleiben.",
|
| 128 |
+
"score2": 94.5,
|
| 129 |
+
"src3": "Iceberg lettuce is often used in burgers for its crunch and mild flavor.",
|
| 130 |
+
"mt3": "Íssalat er oft notað í hamborgara vegna stökkleika og milds bragðs.",
|
| 131 |
+
"score3": 92.0,
|
| 132 |
+
"src4": "Farmers harvest lettuce early in the morning to keep it fresh longer.",
|
| 133 |
+
"mt4": "Les agriculteurs récoltent la laitue tôt le matin pour la garder fraîche plus longtemps.",
|
| 134 |
+
"score4": 82.3
|
| 135 |
},
|
| 136 |
{
|
| 137 |
"src": "Goats have rectangular pupils, which give them a wide field of vision—up to 320 degrees!",
|
| 138 |
"mt": "Kozy mají obdélníkové zornice, což jim umožňuje vidět skoro všude kolem sebe, aniž by musely otáčet hlavou.",
|
| 139 |
"src2": "Sheep, like goats, also have rectangular pupils for better peripheral vision.",
|
| 140 |
"mt2": "Вівці, як і кози, також мають прямокутні зіниці для кращого периферичного зору.",
|
| 141 |
+
"score2": 96.0,
|
| 142 |
+
"src3": "This unique eye shape helps them detect predators early.",
|
| 143 |
+
"mt3": "Ця унікальна форма очей допомагає їм рано виявляти хижаків.",
|
| 144 |
+
"score3": 93.2,
|
| 145 |
+
"src4": "Goats' vision stays stable even when they climb steep surfaces.",
|
| 146 |
+
"mt4": "Kozí vidění zůstává stabilní když jdou na vysoký svah.",
|
| 147 |
+
"score4": 67.4
|
| 148 |
},
|
| 149 |
{
|
| 150 |
"src": "This helps them spot predators from almost all directions without moving their heads.",
|
| 151 |
"mt": "Điều này giúp chúng phát hiện kẻ săn mồi từ gần như mọi hướng mà không cần quay đầu.",
|
| 152 |
"src2": "Many prey animals have evolved to detect threats with minimal movement.",
|
| 153 |
"mt2": "Nhiều động vật thịt có tiến hóa để xem mối nguy bằng nhỏ đi lại.",
|
| 154 |
+
"score2": 42.3,
|
| 155 |
+
"src3": "Nepohybující se držení těla pomáhá zvířatům zůstat bez povšimnutí a přitom být ostražitá.",
|
| 156 |
+
"mt3": "Нерухома поза допомагає тваринам залишатися непоміченими, залишаючись при цьому пильними.",
|
| 157 |
+
"score3": 90.1,
|
| 158 |
+
"src4": "Animals like deer rely on peripheral vision to detect danger.",
|
| 159 |
+
"mt4": "Los animales como los ciervos confían en la visión periférica para detectar el peligro.",
|
| 160 |
+
"score4": 85.0
|
| 161 |
}
|
| 162 |
]
|
| 163 |
+
|
| 164 |
print("scores", model.predict(data, batch_size=8, gpus=1).scores)
|
| 165 |
```
|
| 166 |
Outputs:
|
| 167 |
```
|
| 168 |
+
scores [98.42459869384766, 94.70307922363281, 91.14827728271484]
|
| 169 |
```
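
The numbered keys in the example above follow a regular pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of `comet_poly`: the helper name `to_poly_item` and the `src`/`mt`/`score` layout of each in-context example are assumptions made for illustration.

```python
def to_poly_item(src, mt, examples, n_examples=3):
    """Build one input dict in the `src2`/`mt2`/`score2`, ... key layout.

    Hypothetical helper (not a comet_poly function). Each element of
    `examples` is assumed to be a dict with `src`, `mt`, and `score` keys.
    """
    if len(examples) != n_examples:
        raise ValueError(f"expected {n_examples} in-context examples, got {len(examples)}")
    item = {"src": src, "mt": mt}
    # In-context examples are numbered from 2; the unnumbered keys hold the pair to be scored.
    for i, example in enumerate(examples, start=2):
        item[f"src{i}"] = example["src"]
        item[f"mt{i}"] = example["mt"]
        item[f"score{i}"] = example["score"]
    return item


item = to_poly_item(
    src="Iceberg lettuce got its name in the 1920s when it was shipped packed in ice to stay fresh.",
    mt="Eisbergsalat erhielt seinen Namen in den 1920er-Jahren, als er in Eis verpackt verschickt wurde, um frisch zu bleiben.",
    examples=[
        {
            "src": "Lettuce is mostly water, which helps keep it crisp when chilled.",
            "mt": "Kopfsalat besteht größtenteils aus Wasser, was ihm hilft, beim Kühlen knackig zu bleiben.",
            "score": 94.5,
        },
        {
            "src": "Iceberg lettuce is often used in burgers for its crunch and mild flavor.",
            "mt": "Íssalat er oft notað í hamborgara vegna stökkleika og milds bragðs.",
            "score": 92.0,
        },
        {
            "src": "Farmers harvest lettuce early in the morning to keep it fresh longer.",
            "mt": "Les agriculteurs récoltent la laitue tôt le matin pour la garder fraîche plus longtemps.",
            "score": 82.3,
        },
    ],
)
print(sorted(item))
```

A list of such items has the same shape as the `data` list passed to `model.predict` above.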
|
| 170 |
|
| 171 |
You can use readily available training data for on-the-fly retrieval.
|
|
|
|
| 192 |
data_retrieved = comet_poly.retrieval.retrieve_from_kb(
|
| 193 |
data=data,
|
| 194 |
data_kb=data_kb,
|
| 195 |
+
k=3, # this model takes three in-context examples
|
| 196 |
prevent_hardmatch=False,
|
| 197 |
key="src",
|
| 198 |
)
|