---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-0.5B
---
|
## Model Description |
|
This Memory Decoder model is trained on the Law domain and can be combined with any model in the Qwen2 and Qwen2.5 families to enhance its performance on legal text (see the usage sketch below).
|
**Paper:** [Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models](https://www.arxiv.org/abs/2508.09874) |
|
**GitHub:** [https://github.com/LUMIA-Group/MemoryDecoder](https://github.com/LUMIA-Group/MemoryDecoder/tree/main) |
|
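The snippet below is a minimal sketch of how such a plug-and-play combination can look, assuming the Memory Decoder checkpoint loads as a standard `transformers` causal LM and that its next-token distribution is linearly interpolated with the base model's. The repository id `memdec_id` and the weight `lam` are illustrative placeholders, not values from this card; see the paper and repository for the exact inference procedure and recommended settings.

```python
# Minimal sketch (not the official inference code): combine the Memory
# Decoder's next-token distribution with a Qwen base model's by linear
# interpolation. Ids and the weight `lam` below are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen2.5-0.5B"              # any Qwen2 / Qwen2.5 model
memdec_id = "path/to/this-memory-decoder"  # placeholder for this checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id).eval()
memdec = AutoModelForCausalLM.from_pretrained(memdec_id).eval()

lam = 0.5  # interpolation weight (assumed; tune on held-out domain data)

@torch.no_grad()
def next_token_log_probs(text: str) -> torch.Tensor:
    """Log-probabilities of the next token under the interpolated model."""
    inputs = tokenizer(text, return_tensors="pt")
    p_base = base(**inputs).logits[:, -1].softmax(dim=-1)
    p_mem = memdec(**inputs).logits[:, -1].softmax(dim=-1)
    # Interpolate in probability space, as in kNN-LM-style augmentation.
    return (lam * p_mem + (1 - lam) * p_base).log()
```

The same decoder can be paired with any base model listed in the tables below because the Qwen2 and Qwen2.5 families share a tokenizer, so the two output vocabularies line up position-for-position.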
|
## Training & Evaluation Data |
|
**Law Domain Dataset:** [AsyLex](https://huggingface.co/datasets/clairebarale/AsyLex) |
|
**Test Split:** [MemoryDecoder-domain-data](https://huggingface.co/datasets/Clover-Hill/MemoryDecoder-domain-data) |
|
## Performance Results |
|
### Qwen2 Family |
|
| Model | Base Model | Base + MemDec |
|-------|------------|---------------|
| Qwen2-0.5B | 10.23 | 4.57 |
| Qwen2-1.5B | 7.69 | 4.32 |
| Qwen2-7B | 5.92 | 4.00 |
| Qwen2-72B | 4.84 | 3.69 |
|
### Qwen2.5 Family |
|
| Model | Base Model | Base + MemDec |
|-------|------------|---------------|
| Qwen2.5-0.5B | 9.86 | 4.57 |
| Qwen2.5-1.5B | 7.42 | 4.29 |
| Qwen2.5-3B | 6.68 | 4.16 |
| Qwen2.5-7B | 5.94 | 4.01 |
| Qwen2.5-14B | 5.35 | 3.86 |
| Qwen2.5-32B | 5.18 | 3.81 |
| Qwen2.5-72B | 4.84 | 3.70 |
|
*Perplexity scores on the Law domain test set. Lower is better.*
|
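As a point of reference, the sketch below shows how such perplexity numbers are typically computed: the exponential of the average per-token negative log-likelihood over the held-out split. It scores the base model alone; the Base + MemDec column would instead use the interpolated distribution sketched under Model Description. The split name and `"text"` field are assumptions about the dataset layout, not taken from this card.

```python
# Minimal perplexity sketch: exp(mean next-token NLL) over held-out text.
# Split and field names are assumed; check the dataset card for the layout.
import math
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).eval()

ds = load_dataset("Clover-Hill/MemoryDecoder-domain-data", split="test")

nll_sum, n_tokens = 0.0, 0
with torch.no_grad():
    for row in ds.select(range(8)):  # a few documents, for illustration
        ids = tokenizer(row["text"], return_tensors="pt").input_ids
        # With labels=input_ids, the model returns mean next-token NLL as loss.
        out = model(input_ids=ids, labels=ids)
        n = ids.shape[1] - 1         # number of predicted tokens
        nll_sum += out.loss.item() * n
        n_tokens += n

print(f"perplexity: {math.exp(nll_sum / n_tokens):.2f}")
```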
|
## Citation |
|
```bibtex
@article{cao2025memory,
  title={Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models},
  author={Cao, Jiaqi and Wang, Jiarui and Wei, Rubin and Guo, Qipeng and Chen, Kai and Zhou, Bowen and Lin, Zhouhan},
  journal={arXiv preprint arXiv:2508.09874},
  year={2025}
}
```
|
## Contact |
|
For questions and support: [email protected] |