---
library_name: transformers
license: apache-2.0
datasets:
- wmt/wmt14
---

# Quick start guide

To use this model, run the snippet below:

```python
from transformers import AutoModelForMaskedLM

# model_config_overrides = {}  # Optionally override config parameters here
model = AutoModelForMaskedLM.from_pretrained(
    "kuleshov-group/e2d2-wmt",
    trust_remote_code=True,
    # **model_config_overrides,
)
```

# Model details

- Trained from scratch on [`wmt/wmt14`](https://huggingface.co/datasets/wmt/wmt14)
- Qwen3 tokenizer: [`Qwen/Qwen3-0.6B-Base`](https://huggingface.co/Qwen/Qwen3-0.6B-Base)
- Block diffusion parameterization, with block size 4

See the project site for more details and links to the paper and code: https://m-arriola.com/e2d2/

# Citation

```
@inproceedings{
arriola2025e2d2,
title={Encoder-Decoder Diffusion Language Models for Efficient Training and Inference},
author={Marianne Arriola and Yair Schiff and Hao Phung and Aaron Gokaslan and Volodymyr Kuleshov},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://arxiv.org/abs/2510.22852}
}
```
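
# Example: loading the tokenizer and running a forward pass

As a quick sanity check, the minimal sketch below pairs the model with the Qwen3 tokenizer listed under Model details and runs a plain forward pass. It assumes the remote code exposes a standard masked-LM interface that returns token logits; the block diffusion sampling loop itself lives in the project code linked above, and the input sentence is purely illustrative.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# The model was trained with the Qwen3 tokenizer (see "Model details" above).
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B-Base")
model = AutoModelForMaskedLM.from_pretrained(
    "kuleshov-group/e2d2-wmt",
    trust_remote_code=True,
)
model.eval()

# Illustrative WMT14-style source sentence; assumes the remote code accepts
# standard tokenizer outputs and returns a masked-LM-style `logits` tensor.
inputs = tokenizer("Das ist ein Test.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.logits.shape)  # expected: (batch, sequence_length, vocab_size)
```

For actual translation with the block diffusion decoding procedure (block size 4), use the sampling utilities from the project repository linked on the project site rather than a raw forward pass.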