arxiv:2404.18031

Quality Estimation with k-nearest Neighbors and Automatic Evaluation for Model-specific Quality Estimation

Published on Apr 27, 2024

Abstract

AI-generated summary: A model-specific, unsupervised QE approach called $k$NN-QE uses $k$-nearest neighbors over the MT model's training data to estimate translation quality without human scores, and MetricX-23 is identified as the best reference-based metric for automatically evaluating it.

Providing quality scores along with Machine Translation (MT) output, so-called reference-free Quality Estimation (QE), is crucial for informing users about the reliability of a translation. We propose a model-specific, unsupervised QE approach, termed kNN-QE, that extracts information from the MT model's training data using k-nearest neighbors. Measuring the performance of model-specific QE methods is not straightforward: they provide quality scores on their own MT output, so they cannot be evaluated using benchmark QE test sets, which contain human quality scores on premade MT output. Therefore, we propose an automatic evaluation method that uses quality scores from reference-based metrics as the gold standard instead of human-generated ones. We are the first to conduct detailed analyses, and we conclude that this automatic method is sufficient and that the reference-based MetricX-23 is best for the task.
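The two ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes each output token is represented by a vector, that the "datastore" is a plain list of vectors built from training data, and that a sentence score is the negated mean distance to the k nearest datastore entries. The function names (`knn_mean_distance`, `sentence_qe_score`, `pearson`) and all numbers are hypothetical. The evaluation step is shown as a Pearson correlation between QE scores and scores from a reference-based metric, which stands in for whatever agreement measure the authors actually use.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_mean_distance(query, datastore, k):
    """Mean distance from `query` to its k nearest vectors in `datastore`.

    Assumes the datastore is non-empty; if k exceeds its size,
    all entries are used.
    """
    distances = sorted(euclidean(query, v) for v in datastore)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


def sentence_qe_score(token_vectors, datastore, k=2):
    """Illustrative sentence-level score (an assumption, not the paper's):
    the negated average of per-token kNN distances, so that output lying
    closer to the training data receives a higher score."""
    per_token = [knn_mean_distance(q, datastore, k) for q in token_vectors]
    return -sum(per_token) / len(per_token)


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Toy datastore and two "translations" given as lists of token vectors.
datastore = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
translations = [
    [(0.0, 0.0), (1.0, 0.0)],   # tokens close to the training data
    [(5.0, 5.0), (4.0, 6.0)],   # tokens far from the training data
]
qe_scores = [sentence_qe_score(t, datastore, k=2) for t in translations]

# Hypothetical gold scores from a reference-based metric (higher = better here).
metric_scores = [0.9, 0.2]
agreement = pearson(qe_scores, metric_scores)
```

With real data, `metric_scores` would come from a reference-based metric run on the same MT output, and the sign convention of that metric (whether higher or lower means better) would need to be checked before interpreting the correlation.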
