# Model Card for iFLYTEK Spark Chemistry-X1-13B
## Model Introduction
iFLYTEK Spark Chemistry-X1-13B is a chemistry-specialized large language model developed by the iFLYTEK team. Fine-tuned from the iFLYTEK Spark-X1 foundation model on diverse chemical task datasets, it solves complex chemistry problems while retaining strong general capabilities, and it shows clear advantages over leading general-purpose models on most of the chemistry benchmarks reported below.
## Key Features
- **Deep Reasoning Architecture**: Unified framework combining long Chain-of-Thought (CoT) reasoning with dual-process theory, supporting both fast (reactive) and slow (deliberative) thinking modes
- **Hybrid Training Stability**: Novel attention-masking mechanisms decouple the training phases of the different reasoning modes, preventing interference between their data distributions
- **Chemical Domain Enhancement**: Multi-stage optimization for specialized tasks, including:
  - Advanced knowledge Q&A
  - Chemical name conversion
  - Molecular property prediction
## Model Summary
| Parameter | Value |
|---|---|
| Total Parameters | 13B |
| Context Length | 32K |
| Window Length | 32K |
| Number of Layers | 40 |
| Attention Hidden Dim | 5120 |
| Attention Heads | 40 |
| Vocabulary Size | 130K |
| Attention Mechanism | GQA |
| Activation Function | GeLU |
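
As a rough guide to hardware requirements (a back-of-the-envelope estimate from the 13B parameter count, not a figure stated in the model card), the weight-only memory footprint works out as follows; this is also why the optional BF16 conversion described below roughly halves load size:

```python
# Weight-only memory estimate from the 13B parameter count (illustrative figures;
# actual usage also includes KV cache, activations, and framework overhead).
n_params = 13e9
print(f"FP32 weights: ~{n_params * 4 / 2**30:.0f} GiB")  # ~48 GiB
print(f"BF16 weights: ~{n_params * 2 / 2**30:.0f} GiB")  # ~24 GiB
```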
## Evaluation Results
*Bold = global SOTA.*

| Task | Metric | Spark Chemistry-X1-13B | DeepSeek-R1 | Gemini 2.5 Pro | GPT-4.1 | O3-mini |
|---|---|---|---|---|---|---|
| Advanced Knowledge Q&A | Acc | **84.00** | 77.00 | 64.00 | 76.00 | 80.00 |
| Name Conversion | Acc | **71.00** | 6.00 | 15.00 | 4.00 | 6.00 |
| Property Prediction | Acc | **85.33** | 41.73 | 51.19 | 51.66 | 67.58 |
Evaluation Notes:
- All results show zero-shot performance averages
- Consistent evaluation protocol applied across all models
- DeepSeek-R1, Gemini 2.5 Pro, GPT-4.1, and O3-mini were evaluated using Chain-of-Thought (CoT) reasoning with API verification
- Spark Chemistry-X1-13B was evaluated using Chain-of-Thought (CoT) reasoning in a local environment on NVIDIA A800 80GB GPUs
- The evaluation dataset was self-constructed
## Usage
### Requirements
```bash
cd /path/to/Spark-Chemistry-X1-13B
# We recommend using Python 3.10
pip install -r requirements.txt
pip install .
```
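
Before loading the 13B FP32 weights, it can help to sanity-check the environment. This is an optional snippet using standard PyTorch calls, not part of the official setup:

```python
import torch

# Confirm the interpreter and CUDA setup before loading the model.
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```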
### Quickstart
**Reactive (fast thinking):**

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Reactive (fast-thinking) mode: no system prompt, the model answers directly.
# Example question (Chinese): "Whether a polymer material is flexible mainly depends
# on the mobility of ( ). A. main-chain segments B. side groups
# C. functional groups or atoms within the side groups"
chat_history = [
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]
inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)
outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```
**Deliberative (slow thinking):**

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer (same as above)
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Deliberative (slow-thinking) mode: the system prompt asks the model to first
# generate an explicit reasoning trace, opened with <unused6> and closed with
# <unused7>, and to place the final answer after <unused7>.
# System prompt (Chinese), roughly: "First analyze the key points and internal
# logic of the question and generate a thinking process, then answer based on it.
# The thinking process starts with <unused6> and ends with <unused7>; the content
# after <unused7> is the answer based on the thinking process."
chat_history = [
    {
        "role": "system",
        "content": "请你先深入剖析给出问题的关键要点与内在逻辑,生成思考过程,再根据思考过程回答给出问题。思考过程以<unused6>开头,在结尾处用<unused7>标注结束,<unused7>后为基于思考过程的回答内容"
    },
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]
inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)
outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```
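
If you want to separate the reasoning trace from the final answer in deliberative mode, one option is to split on the `<unused7>` marker described in the system prompt. The helper below is a minimal sketch, not an official utility: it assumes the markers survive decoding (you may need `skip_special_tokens=False` in `tokenizer.decode` if they are registered as special tokens).

```python
def split_thinking(text: str, start: str = "<unused6>", end: str = "<unused7>"):
    """Split a deliberative-mode response into (thinking, answer).

    Hypothetical helper: falls back to an empty thinking section if the markers are absent.
    """
    if end not in text:
        return "", text.strip()
    thinking, answer = text.split(end, 1)
    thinking = thinking.replace(start, "", 1)
    return thinking.strip(), answer.strip()

thinking, answer = split_thinking(response)
print("--- thinking ---\n", thinking)
print("--- answer ---\n", answer)
```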
### Optional: Convert FP32 Weights to BF16
The released weights of Spark Chemistry-X1-13B are stored in FP32 precision. For inference efficiency, users can optionally convert the weights into bfloat16 (BF16) format.
```python
from modelscope import AutoModelForCausalLM
import torch

model_name = "/path/to/Spark-Chemistry-X1-13B"

# Load FP32 weights
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,  # explicitly FP32
    device_map="auto",
    trust_remote_code=True
)

# Convert to BF16
model = model.to(torch.bfloat16)

# Save BF16 weights for faster loading later
save_path = "./Spark-Chemistry-X1-13B-bf16"
model.save_pretrained(save_path, safe_serialization=True)
```
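
The converted checkpoint can then be loaded directly in BF16. This is a minimal sketch under the assumption that the tokenizer (and, if needed, the custom modeling code) is still taken from the original model directory, since only the model weights and config were re-saved above:

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the re-saved BF16 weights; roughly halves weight memory versus FP32.
model = AutoModelForCausalLM.from_pretrained(
    "./Spark-Chemistry-X1-13B-bf16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
# Tokenizer files live in the original model directory.
tokenizer = AutoTokenizer.from_pretrained(
    "/path/to/Spark-Chemistry-X1-13B",
    trust_remote_code=True
)
```

If loading the BF16 directory complains about missing tokenizer or custom code files, copy them over from the original model directory.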
## License Agreement
iFLYTEK Spark Chemistry-X1-13B is licensed under Apache 2.0.