# Model Card for iFLYTEK Spark Chemistry-X1-13B
## Model Introduction
iFLYTEK Spark Chemistry-X1-13B is a chemistry-specialized large language model developed by the iFLYTEK team. Fine-tuned from the iFLYTEK Spark-X1 foundation model on diverse chemical task datasets, it solves complex chemistry problems while retaining strong general capabilities, and it shows clear advantages over leading general-purpose models on most of the chemistry benchmarks reported below.
## Key Features
- **Deep Reasoning Architecture**: Unified framework combining long Chain-of-Thought (CoT) reasoning with dual-process theory, supporting both fast (reactive) and slow (deliberative) thinking modes
- **Hybrid Training Stability**: Novel attention-masking mechanisms decouple the training phases of the different reasoning modes, preventing interference between their data distributions
- **Chemical Domain Enhancement**: Multi-stage optimization for specialized tasks, including:
  - Advanced knowledge Q&A
  - Chemical name conversion
  - Molecular property prediction
## Model Summary
| Parameter | Value |
|---|---|
| Total Parameters | 13B |
| Context Length | 32K |
| Window Length | 32K |
| Number of Layers | 40 |
| Attention Hidden Dim | 5120 |
| Attention Heads | 40 |
| Vocabulary Size | 130K |
| Attention Mechanism | GQA |
| Activation Function | GeLU |
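
As a rough guide to hardware requirements (a back-of-the-envelope estimate from the 13B parameter count, not a figure stated in the model card), the weight-only memory footprint works out as follows; this is also why the optional BF16 conversion described below roughly halves load size:

```python
# Weight-only memory estimate from the 13B parameter count (illustrative figures;
# actual usage also includes KV cache, activations, and framework overhead).
n_params = 13e9
print(f"FP32 weights: ~{n_params * 4 / 2**30:.0f} GiB")  # ~48 GiB
print(f"BF16 weights: ~{n_params * 2 / 2**30:.0f} GiB")  # ~24 GiB
```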
## Evaluation Results
*Bold = global SOTA.*

| Task | Metric | Spark Chemistry-X1-13B | DeepSeek-R1 | Gemini 2.5 Pro | GPT-4.1 | O3-mini |
|---|---|---|---|---|---|---|
| Advanced Knowledge Q&A | Acc | **84.00** | 77.00 | 64.00 | 76.00 | 80.00 |
| Name Conversion | Acc | **71.00** | 6.00 | 15.00 | 4.00 | 6.00 |
| Property Prediction | Acc | **85.33** | 41.73 | 51.19 | 51.66 | 67.58 |
Evaluation Notes:
- All results show zero-shot performance averages
- Consistent evaluation protocol applied across all models
- DeepSeek-R1, Gemini 2.5 Pro, GPT-4.1, and O3-mini were evaluated using Chain-of-Thought (CoT) reasoning with API verification
- Spark Chemistry-X1-13B was evaluated using Chain-of-Thought (CoT) reasoning in a local environment on NVIDIA A800 80GB GPUs
- The evaluation dataset was self-constructed
## Usage
### Requirements
```bash
cd /path/to/Spark-Chemistry-X1-13B
# We recommend using Python 3.10
pip install -r requirements.txt
pip install .
```
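
Before loading the 13B FP32 weights, it can help to sanity-check the environment. This is an optional snippet using standard PyTorch calls, not part of the official setup:

```python
import torch

# Confirm the interpreter and CUDA setup before loading the model.
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```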
### Quickstart
**Reactive (fast thinking):**

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Reactive (fast-thinking) mode: no system prompt, the model answers directly.
# Example question (Chinese): "Whether a polymer material is flexible mainly depends
# on the mobility of ( ). A. main-chain segments B. side groups
# C. functional groups or atoms within the side groups"
chat_history = [
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]
inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)
outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```
**Deliberative (slow thinking):**

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer (same as above)
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)

# Deliberative (slow-thinking) mode: the system prompt asks the model to first
# generate an explicit reasoning trace, opened with <unused6> and closed with
# <unused7>, and to place the final answer after <unused7>.
# System prompt (Chinese), roughly: "First analyze the key points and internal
# logic of the question and generate a thinking process, then answer based on it.
# The thinking process starts with <unused6> and ends with <unused7>; the content
# after <unused7> is the answer based on the thinking process."
chat_history = [
    {
        "role": "system",
        "content": "请你先深入剖析给出问题的关键要点与内在逻辑,生成思考过程,再根据思考过程回答给出问题。思考过程以<unused6>开头,在结尾处用<unused7>标注结束,<unused7>后为基于思考过程的回答内容"
    },
    {
        "role": "user",
        "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
    }
]
inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)
outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True
)
print(response)
```
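
If you want to separate the reasoning trace from the final answer in deliberative mode, one option is to split on the `<unused7>` marker described in the system prompt. The helper below is a minimal sketch, not an official utility: it assumes the markers survive decoding (you may need `skip_special_tokens=False` in `tokenizer.decode` if they are registered as special tokens).

```python
def split_thinking(text: str, start: str = "<unused6>", end: str = "<unused7>"):
    """Split a deliberative-mode response into (thinking, answer).

    Hypothetical helper: falls back to an empty thinking section if the markers are absent.
    """
    if end not in text:
        return "", text.strip()
    thinking, answer = text.split(end, 1)
    thinking = thinking.replace(start, "", 1)
    return thinking.strip(), answer.strip()

thinking, answer = split_thinking(response)
print("--- thinking ---\n", thinking)
print("--- answer ---\n", answer)
```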
### Optional: Convert FP32 Weights to BF16
The released weights of Spark Chemistry-X1-13B are stored in FP32 precision. For inference efficiency, users can optionally convert the weights into bfloat16 (BF16) format.
```python
from modelscope import AutoModelForCausalLM
import torch

model_name = "/path/to/Spark-Chemistry-X1-13B"

# Load FP32 weights
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,  # explicitly FP32
    device_map="auto",
    trust_remote_code=True
)

# Convert to BF16
model = model.to(torch.bfloat16)

# Save BF16 weights for faster loading later
save_path = "./Spark-Chemistry-X1-13B-bf16"
model.save_pretrained(save_path, safe_serialization=True)
```
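
The converted checkpoint can then be loaded directly in BF16. This is a minimal sketch under the assumption that the tokenizer (and, if needed, the custom modeling code) is still taken from the original model directory, since only the model weights and config were re-saved above:

```python
from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the re-saved BF16 weights; roughly halves weight memory versus FP32.
model = AutoModelForCausalLM.from_pretrained(
    "./Spark-Chemistry-X1-13B-bf16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
# Tokenizer files live in the original model directory.
tokenizer = AutoTokenizer.from_pretrained(
    "/path/to/Spark-Chemistry-X1-13B",
    trust_remote_code=True
)
```

If loading the BF16 directory complains about missing tokenizer or custom code files, copy them over from the original model directory.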
## License Agreement
iFLYTEK Spark Chemistry-X1-13B is licensed under Apache 2.0.