---
license: mit
library_name: transformers
tags:
- mergekit
- merge
base_model:
- Qwen/Qwen2.5-7B-Instruct-1M
- Sakalti/SJT-7B-1M
- Triangle104/Q2.5-Instruct-1M_Harmony
- bunnycore/Qwen2.5-7B-RRP-1M
- huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
model-index:
- name: Qwen2.5-7B-CelestialHarmony-1M
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 59.44
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 34.51
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 33.01
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 9.17
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 16.74
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 37.63
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
      name: Open LLM Leaderboard
---
# ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
**ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M** is a custom merged language model based on **Qwen2.5-7B** with enhanced reasoning, roleplaying, and long-context capabilities. This model supports up to **1 million token** context lengths, making it ideal for ultra-long text processing, deep reasoning tasks, and immersive roleplay interactions.
Quants are available in GGUF format, provided by [mradermacher](https://huggingface.co/mradermacher); a local-inference sketch follows the links below.
1. [GGUF](https://huggingface.co/mradermacher/Qwen2.5-7B-CelestialHarmony-1M-GGUF)
2. [imatrix GGUF](https://huggingface.co/mradermacher/Qwen2.5-7B-CelestialHarmony-1M-i1-GGUF)
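If you want to try a quant locally, one option is `llama-cpp-python`. The sketch below is illustrative only: the `.gguf` filename, context window, and GPU offload settings are assumptions to adjust for the file you actually download and the hardware available.
```python
# Minimal local-inference sketch with llama-cpp-python (pip install llama-cpp-python).
# The model_path below is an assumed filename; point it at the quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2.5-7B-CelestialHarmony-1M.Q4_K_M.gguf",  # assumed local file
    n_ctx=32768,      # context window; raise as far as memory allows
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Tell me a short story about an ancient celestial warrior."},
    ],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```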
---
## 🔧 **Model Details**
- **Base Model**: `Qwen/Qwen2.5-7B-Instruct-1M`
- **Models Used in Merge**:
- `Qwen/Qwen2.5-7B-Instruct-1M`
- `bunnycore/Qwen2.5-7B-RRP-1M`
- `Triangle104/Q2.5-Instruct-1M_Harmony`
- `Sakalti/SJT-7B-1M`
- `huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated`
- **Merge Method**: `model_stock` (layer-wise weight averaging of the fine-tuned models around the base model)
---
## 📖 **Overview**
**Qwen2.5-7B-CelestialHarmony-1M** enhances the **Qwen2.5-7B series** with a fine-tuned balance of roleplaying dynamics, structured reasoning, and long-context memory. The model is particularly well-suited for:
- **Roleplaying** 🧝‍♂️: Immersive character-based storytelling with deep contextual awareness.
- **Reasoning & Thought Processing** 🧠: Capable of structured logical thinking, especially when prompted with `<think>` tags (see the prompting sketch after this list).
- **Ultra-Long Context Handling** 📜: Efficient processing of sequences up to **1,010,000 tokens** using optimized sparse attention.
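The `<think>` behavior is prompt-driven rather than a fixed output format. A hypothetical prompting sketch follows; the system-prompt wording is an assumption, not a required template.
```python
# Hypothetical sketch: nudging the model to reason inside <think> tags before answering.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M")

messages = [
    {"role": "system", "content": "Reason step by step inside <think>...</think> tags, "
                                  "then give your final answer after the closing tag."},
    {"role": "user", "content": "A caravan travels 42 km per day. How far does it go in 9 days?"},
]

# Render the chat template; feed the resulting string to model.generate as in the Quickstart below.
prompt_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt_text)
```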
---
## ⚙️ **Technical Specifications**
| Specification | Value |
|--------------|---------|
| **Model Type** | Causal Language Model |
| **Parameters** | 7.61B |
| **Non-Embedding Parameters** | 6.53B |
| **Layers** | 28 |
| **Attention Heads (GQA)** | 28 (Q), 4 (KV) |
| **Max Context Length** | 1,010,000 tokens |
| **Max Generation Length** | 8,192 tokens |
| **Merge Method** | Model Stock |
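These figures follow the upstream Qwen2.5-7B-Instruct-1M configuration; you can sanity-check them directly from the published config, as in the sketch below (the attribute names are the standard Qwen2 config fields in `transformers`).
```python
# Sanity-check the architecture numbers in the table above against the model config.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M")

print("layers:         ", cfg.num_hidden_layers)        # expected 28
print("query heads:    ", cfg.num_attention_heads)      # expected 28
print("key/value heads:", cfg.num_key_value_heads)      # expected 4 (GQA)
print("max positions:  ", cfg.max_position_embeddings)  # expected 1010000 for the 1M variant
```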
---
## 🔬 **Merging Details**
This model was merged with the **Model Stock** method, which averages the weights of several fine-tuned models around a shared base model to produce a single merge that balances their strengths.
### **Merge YAML Configuration**
```yaml
base_model: Qwen/Qwen2.5-7B-Instruct-1M
dtype: bfloat16
merge_method: model_stock
models:
- model: Qwen/Qwen2.5-7B-Instruct-1M
- model: Triangle104/Q2.5-Instruct-1M_Harmony
- model: Sakalti/SJT-7B-1M
- model: bunnycore/Qwen2.5-7B-RRP-1M
- model: huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
tokenizer_source: Qwen/Qwen2.5-7B-Instruct-1M
```
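To reproduce the merge, save the YAML above to a file and run it through MergeKit. The sketch below uses MergeKit's Python entry points (`MergeConfiguration`, `run_merge`, `MergeOptions`); their exact signatures can shift between MergeKit releases, so treat it as an outline and fall back to the `mergekit-yaml` CLI if it does not match your installed version.
```python
# Outline of reproducing the merge via MergeKit's Python API.
# Assumes the YAML above was saved as "celestial_harmony.yaml"; details may
# differ slightly depending on the installed MergeKit version.
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("celestial_harmony.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    out_path="./Qwen2.5-7B-CelestialHarmony-1M",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # run the merge on GPU if one is present
        copy_tokenizer=True,             # copy the tokenizer_source tokenizer into the output
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```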
---
## 🚀 **Quickstart**
### **Install Required Packages**
Ensure you have the latest `transformers` library installed:
```bash
pip install transformers torch accelerate
```
### **Load and Use the Model**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M"

# Load the merged model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Build a chat-formatted prompt
prompt = "Tell me a short story about an ancient celestial warrior."
messages = [
    {"role": "system", "content": "You are a wise celestial storyteller."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then decode only the newly generated tokens (strip the prompt)
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
---
## ⚡ **Optimized Deployment with vLLM**
For long-context inference, use **vLLM**:
```bash
git clone -b dev/dual-chunk-attn [email protected]:QwenLM/vllm.git
cd vllm
pip install -e . -v
```
Run the model:
```bash
vllm serve ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M \
  --tensor-parallel-size 4 \
  --max-model-len 1010000 \
  --enable-chunked-prefill --max-num-batched-tokens 131072 \
  --enforce-eager \
  --max-num-seqs 1
```
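Once the server is running, it exposes vLLM's OpenAI-compatible API (port 8000 by default). A minimal client sketch, assuming the default address and no API key:
```python
# Query the vLLM server through its OpenAI-compatible endpoint.
# Assumes the default address http://localhost:8000 and no authentication.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M",
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Summarize the key ideas of Model Stock merging in two sentences."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```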
---
## 🎯 **Model Capabilities**
✅ **Roleplay & Storytelling** – Designed for engaging, character-driven interactions.
✅ **Long-Context Awareness** – Handles texts up to **1M tokens**.
✅ **Logical Thinking & Reasoning** – Supports `<think>` tags for structured reasoning.
✅ **Optimized Merge Strategy** – Uses `Model Stock` averaging to balance the strengths of its source models.
---
## 📜 **Acknowledgments**
This model is built on top of **Qwen2.5-7B**, with contributions from **bunnycore, Triangle104, Sakalti, and huihui-ai**, leveraging the **Model Stock** merging methodology.
For further details, see:
- 📄 [Qwen2.5-1M Technical Report](https://arxiv.org/abs/2501.15383)
- 📖 [MergeKit Documentation](https://github.com/arcee-ai/mergekit)
- 🚀 [vLLM for Long-Context Inference](https://github.com/QwenLM/vllm)
---
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/ZeroXClem__Qwen2.5-7B-CelestialHarmony-1M-details).
| Metric |Value|
|-------------------|----:|
|Avg. |31.75|
|IFEval (0-Shot) |59.44|
|BBH (3-Shot) |34.51|
|MATH Lvl 5 (4-Shot)|33.01|
|GPQA (0-shot) | 9.17|
|MuSR (0-shot) |16.74|
|MMLU-PRO (5-shot) |37.63|