We release the suite of models trained as part of our work on the scaling laws of decoder-only models for multilingual machine translation. This work was published at WMT 2024 and is available here.
These models have been trained on a mixture of general and financial sentences covering 11 language directions. They support 8 languages (English, French, German, Italian, Spanish, Dutch, Swedish and Portuguese) and 9 domains (general plus 8 financial subdomains). They are not tailored for document-level translation.
A running demo of these models is available on our dedicated space.
## Evaluation
The table below details the performance of our models on general-domain translation.
| Model | BLEU | COMET | COMET-Kiwi | 
|---|---|---|---|
| FinTranslate-70M | 29.62 | 81.31 | 80.72 | 
| FinTranslate-160M | 32.43 | 84.00 | 83.45 | 
| FinTranslate-410M | 33.60 | 84.81 | 84.14 | 
| FinTranslate-Bronze | 34.08 | 85.10 | 84.35 | 
| FinTranslate-Silver | 34.42 | 85.10 | 84.33 | 
| FinTranslate-Gold | 36.07 | 85.88 | 84.82 | 
| Llama 3.1 8B | 30.43 | 84.82 | 84.47 | 
| Mistral 7B | 23.26 | 80.08 | 82.29 | 
| Tower 7B | 33.50 | 85.91 | 85.02 | 
The table below details the performance of our models on financial-domain translation.
| Model | BLEU | COMET | COMET-Kiwi | 
|---|---|---|---|
| FinTranslate-70M | 44.63 | 86.95 | 80.88 | 
| FinTranslate-160M | 49.02 | 88.27 | 81.80 | 
| FinTranslate-410M | 50.85 | 88.64 | 81.73 | 
| FinTranslate-Bronze | 52.00 | 88.85 | 81.71 | 
| FinTranslate-Silver | 53.28 | 89.98 | 81.61 | 
| FinTranslate-Gold | 58.34 | 89.62 | 81.35 | 
| Llama 3.1 8B | 34.99 | 84.42 | 81.75 | 
| Mistral 7B | 38.93 | 76.52 | 76.17 | 
| Tower 7B | 38.93 | 86.49 | 82.66 | 
## How to use it
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Supported languages and domains. Domain display names are mapped to the
# short identifiers used in the models' special domain tokens.
LANGUAGES = ["en", "de", "es", "fr", "it", "nl", "sv", "pt"]
DOMAINS = {
    "Asset management": "am",
    "Annual report": "ar",
    "Corporate action": "corporateAction",
    "Equity research": "equi",
    "Fund fact sheet": "ffs",
    "Kiid": "kiid",  # Key Investor Information Document
    "Life insurance": "lifeInsurance",
    "Regulatory": "regulatory",
    "General": "general",
}


def language_token(lang):
    return f"<lang_{lang}>"


def domain_token(dom):
    return f"<dom_{dom}>"


def format_input(src, tgt_lang, src_lang, domain):
    assert tgt_lang in LANGUAGES
    tgt_lang_token = language_token(tgt_lang)
    # Please read our paper to understand why the input must be prefixed with <eos>
    base_input = f"<eos>{src}</src>{tgt_lang_token}"
    if src_lang is None:
        return base_input
    else:
        assert src_lang in LANGUAGES
        src_lang_token = language_token(src_lang)
        base_input = f"{base_input}{src_lang_token}"
    if domain is None:
        return base_input
    else:
        # Unknown domain names fall back to the general domain
        domain = DOMAINS.get(domain, "general")
        dom_token = domain_token(domain)
        base_input = f"{base_input}{dom_token}"
    return base_input
```
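For illustration, the helpers above wrap the source text and append the target-language token, followed by the optional source-language and domain tokens:

```python
format_input("Dragon LLM est une entreprise française.", "en", "fr", "General")
# '<eos>Dragon LLM est une entreprise française.</src><lang_en><lang_fr><dom_general>'
```

The model and tokenizer can then be loaded from the Hub and used with the standard `generate` API: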
```python
model_id = "DragonLLM/FinTranslate-410M"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Translate a French sentence into English, using the general domain
source_sentence = "Dragon LLM est une entreprise française spécialisée dans le domaine de l'IA générative."
formatted_sentence = format_input(source_sentence, "en", "fr", "General")

inputs = tokenizer(formatted_sentence, return_tensors="pt", return_token_type_ids=False)
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens, i.e. the translation
input_size = inputs["input_ids"].size(1)
translated_sentence = tokenizer.decode(
    outputs[0, input_size:], skip_special_tokens=True
)
print(translated_sentence)
# Dragon LLM is a French company specialized in the field of generative AI.
```
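The same pipeline applies to the financial subdomains. The helper below is a minimal sketch (the `translate` function and the example sentence are ours, not part of the released code) that wraps the steps above so a domain name such as "Annual report" can be passed directly:

```python
def translate(src, tgt_lang, src_lang=None, domain=None, max_new_tokens=64):
    """Translate a single sentence with FinTranslate (illustrative helper)."""
    prompt = format_input(src, tgt_lang, src_lang, domain)
    inputs = tokenizer(prompt, return_tensors="pt", return_token_type_ids=False)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the generated continuation, i.e. the translation
    return tokenizer.decode(
        outputs[0, inputs["input_ids"].size(1):], skip_special_tokens=True
    )


# Example: English -> German in the "Annual report" financial subdomain
print(translate(
    "Net revenues increased by 12% compared to the previous fiscal year.",
    tgt_lang="de",
    src_lang="en",
    domain="Annual report",
))
```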
## Citing this work
If you use this model in your work, please cite it as:
```bibtex
@inproceedings{caillaut-etal-2024-scaling,
    title = "Scaling Laws of Decoder-Only Models on the Multilingual Machine Translation Task",
    author = {Caillaut, Ga{\"e}tan  and
      Nakhl{\'e}, Mariam  and
      Qader, Raheel  and
      Liu, Jingshu  and
      Barth{\'e}lemy, Jean-Gabriel},
    editor = "Haddow, Barry  and
      Kocmi, Tom  and
      Koehn, Philipp  and
      Monz, Christof",
    booktitle = "Proceedings of the Ninth Conference on Machine Translation",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.wmt-1.124/",
    doi = "10.18653/v1/2024.wmt-1.124",
    pages = "1318--1331"
}
```