YanoljaNEXT-EEVE-Rosetta-7B-2602

This is the 7B EEVE-Rosetta release at release/YanoljaNEXT-EEVE-Rosetta-7B-2602.

  • Model Name: release/YanoljaNEXT-EEVE-Rosetta-7B-2602
  • Base Model: ByteDance-Seed/Seed-X-PPO-7B
  • Architecture: MistralForCausalLM
  • Tokenizer: GemmaTokenizerFast (expanded vocabulary)
  • Vocab Size: 161696
  • Training Context Length: 8192
  • Max Position Embeddings: 32768 (architectural limit, not the training context length)

Model Description

This model is a 7B decoder-only translation model fine-tuned for structured inputs (JSON, YAML, XML); it preserves the original keys and schema of the input.

This version is EEVE-Rosetta: we expanded the vocabulary of ByteDance-Seed/Seed-X-PPO-7B using a tokenizer from the Gemma family (GemmaTokenizerFast), then trained with the Rosetta translation format.

Prompt Format

The chat template uses role tags:

  • instruction for system
  • source for user
  • translation for assistant

Special turn tokens:

  • <start_of_turn>
  • <end_of_turn>

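Under these conventions, a rendered prompt interleaves the role tags and turn tokens as shown below. This is only an illustrative sketch of the layout; the tokenizer's built-in chat template (used via `apply_chat_template` in the usage example) is authoritative.

```python
# Sketch of how the role tags and turn tokens compose a prompt.
# The exact rendering is defined by the tokenizer's chat template.

def render_prompt(system: str, source: str) -> str:
    """Render an instruction/source pair and open the translation turn."""
    return (
        "<bos>"
        f"<start_of_turn>instruction\n{system}<end_of_turn>\n"
        f"<start_of_turn>source\n{source}<end_of_turn>\n"
        "<start_of_turn>translation\n"
    )

prompt = render_prompt(
    "Translate the user's text to Korean.",
    '{"greeting": "Hello"}',
)
print(prompt)
```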
How to use

You can use this model with the transformers library as follows:

import json
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "yanolja/YanoljaNEXT-EEVE-Rosetta-7B-2602"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

target_language = "Korean"
context = {
  "context": "Simple introduction about a tech company.",
  "tone": "Informative and helpful",
  "glossary": {
    "Yanolja NEXT": "야놀자넥스트",
    "travel industry": "여행 산업",
  }
}

system = [f"Translate the user's text to {target_language}."]
for key, value in context.items():
  key_pascal = key.capitalize()
  if isinstance(value, dict):
    system.append(f"{key_pascal}:")
    for f, t in value.items():
      system.append(f"- {f} -> {t}")
  else:
    system.append(f"{key_pascal}: {value}")

system.append("Output format: JSON")
system.append("Provide the final translation immediately without any other text.")

source = {
  "company_name": "Yanolja NEXT",
  "description": "Yanolja NEXT is a company that provides cutting-edge "
                 "technology for the global travel industry.",
}

messages = [
    {"role": "system", "content": "\n".join(system)},
    {"role": "user", "content": json.dumps(source, ensure_ascii=False)},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
# <bos><start_of_turn>instruction
# Translate the user's text to Korean.
# Context: Simple introduction about a tech company.
# Tone: Informative and helpful
# Glossary:
# - Yanolja NEXT -> 야놀자넥스트
# - travel industry -> 여행 산업
# Output format: JSON
# Provide the final translation immediately without any other text.<end_of_turn>
# <start_of_turn>source
# {"company_name": "Yanolja NEXT", "description": "Yanolja NEXT is a company that provides cutting-edge technology for the global travel industry."}<end_of_turn>
# <start_of_turn>translation

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
input_length = inputs["input_ids"].shape[1]

with torch.inference_mode():
    outputs = model.generate(
        **inputs,
        max_new_tokens=64,
    )

generated_tokens = outputs[0][input_length:]
translation = tokenizer.decode(generated_tokens, skip_special_tokens=True)

print(json.dumps(json.loads(translation), indent=2, ensure_ascii=False))
# {
#   "company_name": "야놀자넥스트",
#   "description": "야놀자넥스트는 글로벌 여행 산업에 최첨단 기술을 제공하는 회사입니다."
# }

The model outputs the final translation in the same structured format as the input (JSON, YAML, XML) when appropriate, or plain text for simple translations.
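Since the model is expected to mirror the input schema, a lightweight downstream check can catch malformed outputs before they reach production. A minimal sketch for the JSON case (the helper name is ours, not part of the model's API):

```python
import json

def keys_match(source: dict, translated_text: str) -> bool:
    """Check that the model output parses as JSON and keeps the source keys."""
    try:
        translated = json.loads(translated_text)
    except json.JSONDecodeError:
        return False
    return isinstance(translated, dict) and set(translated) == set(source)

source = {"company_name": "Yanolja NEXT", "description": "..."}
print(keys_match(source, '{"company_name": "야놀자넥스트", "description": "..."}'))
# True
```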

Intended Uses & Limitations

Intended use:

  • Structured translation for multilingual production pipelines.

Limitations:

  • May produce invalid structured output in some cases.
  • May produce repetitive text or partial translations on long contexts.
  • Quality can vary across language pairs and domain-specific terminology.
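Because structured output can occasionally be invalid, a simple retry loop around generation is a practical mitigation. A sketch using a hypothetical `generate_fn` callable standing in for the tokenize → generate → decode steps from the usage example (with sampling enabled, retries can yield different outputs):

```python
import json

def translate_with_retry(generate_fn, max_attempts: int = 3) -> dict:
    """Retry generation until the output parses as JSON.

    generate_fn is a hypothetical zero-argument callable wrapping the
    tokenize -> model.generate -> decode pipeline shown above.
    """
    last_output = None
    for _ in range(max_attempts):
        last_output = generate_fn()
        try:
            return json.loads(last_output)
        except json.JSONDecodeError:
            continue
    raise ValueError(f"No valid JSON after {max_attempts} attempts: {last_output!r}")

# Example with a stub that fails once, then succeeds:
attempts = iter(['{"broken', '{"company_name": "야놀자넥스트"}'])
result = translate_with_retry(lambda: next(attempts))
print(result)  # {'company_name': '야놀자넥스트'}
```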

License

The base model ByteDance-Seed/Seed-X-PPO-7B is published on Hugging Face under the openmdw license tag, and its LICENSE file contains the OpenMDW-1.0 terms. Use of this derivative should follow the base model's license terms.

This release includes local compliance files:

  • LICENSE (OpenMDW-1.0 text)
  • NOTICE (origin and modification notice)
  • THIRD_PARTY_LICENSES.md (component-level license summary)

Citation

If you use this model, please consider citing:

@misc{yanolja2026yanoljanexteeverosetta7b,
  author = {Yanolja NEXT Co., Ltd.},
  title = {YanoljaNEXT-EEVE-Rosetta-7B-2602},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/yanolja/YanoljaNEXT-EEVE-Rosetta-7B-2602}}
}

References

This work builds on several prior models and publications. We thank the original authors for their valuable contributions to the field.

@misc{cheng2025seedxbuildingstrongmultilingual,
  title = {Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters},
  author = {Shanbo Cheng and Yu Bao and Qian Cao and Luyang Huang and Liyan Kang and Zhicheng Liu and Yu Lu and Wenhao Zhu and Jingwen Chen and Zhichao Huang and Tao Li and Yifu Li and Huiying Lin and Sitong Liu and Ningxin Peng and Shuaijie She and Lu Xu and Nuo Xu and Sen Yang and Runsheng Yu and Yiming Yu and Liehao Zou and Hang Li and Lu Lu and Yuxuan Wang and Yonghui Wu},
  year = {2025},
  eprint = {2507.13618},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL},
  url = {https://arxiv.org/abs/2507.13618}
}

@misc{gemma3,
  author = {Google},
  title = {Gemma 3},
  year = {2025},
  publisher = {Google DeepMind},
  howpublished = {\url{https://deepmind.google/models/gemma/gemma-3/}}
}

@misc{jiang2023mistral7b,
  title = {Mistral 7B},
  author = {Albert Q. Jiang and Alexandre Sablayrolles and Arthur Mensch and others},
  year = {2023},
  eprint = {2310.06825},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL},
  url = {https://arxiv.org/abs/2310.06825}
}