
Model Card for iFLYTEK Spark Chemistry-X1-13B

Model Introduction

iFLYTEK Spark Chemistry-X1-13B is a chemistry-specialized large language model developed by the iFLYTEK team. Fine-tuned from the iFLYTEK Spark-X1 foundation model on diverse chemical task datasets, it solves complex chemical problems while retaining strong general capabilities, and it outperforms leading general-purpose models on most of the chemistry-related evaluation metrics reported below.

Key Features

  • Deep Reasoning Architecture: Unified framework combining long Chain-of-Thought (CoT) with dual-process theory, supporting both fast (reactive) and slow (deliberative) thinking modes; a minimal mode-selection sketch follows this list

  • Hybrid Training Stability: Novel attention masking mechanisms decouple training phases for different reasoning modes, preventing interference between data distributions

  • Chemical Domain Enhancement: Multi-stage optimization for specialized tasks including:

    • Advanced Knowledge Q&A
    • Chemical name conversion
    • Molecular property prediction
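
The two thinking modes are selected entirely through the chat template: a plain user message yields a reactive answer, while prepending the deliberative system prompt (given verbatim in the Quickstart below) makes the model emit an explicit reasoning trace first. A minimal sketch of the two message layouts, using a hypothetical English question as a placeholder:

# Reactive (fast) mode: only the user question, no system prompt.
reactive_chat = [
    {"role": "user", "content": "Which factor mainly determines polymer chain flexibility?"}  # hypothetical question
]

# Deliberative (slow) mode: prepend the reasoning system prompt
# (copy the exact Chinese prompt from the Quickstart section below).
DELIBERATIVE_SYSTEM_PROMPT = "..."  # placeholder; see Quickstart
deliberative_chat = [
    {"role": "system", "content": DELIBERATIVE_SYSTEM_PROMPT},
    {"role": "user", "content": "Which factor mainly determines polymer chain flexibility?"},
]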

Model Summary

| Parameter            | Value |
|----------------------|-------|
| Total Parameters     | 13B   |
| Context Length       | 32K   |
| Window Length        | 32K   |
| Number of Layers     | 40    |
| Attention Hidden Dim | 5120  |
| Attention Heads      | 40    |
| Vocabulary Size      | 130K  |
| Attention Mechanism  | GQA   |
| Activation Function  | GeLU  |
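
After loading the model as in the Quickstart below, these values can be sanity-checked against the loaded weights and config; a small sketch (the exact field names on the custom config object are an assumption):

# Assumes `model` was loaded as in the Quickstart below (trust_remote_code).
# The total parameter count should be roughly 13B.
n_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {n_params / 1e9:.1f}B")

# The remaining values (layers, hidden dim, heads, vocab size) live on the
# config object; its field names follow the custom Spark architecture and
# may differ from standard transformers conventions.
print(model.config)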

Evaluation Results

*Bold = Global SOTA*

| Task                   | Metric | Spark Chemistry-X1-13B | DeepSeek-R1 | Gemini 2.5 Pro | GPT-4.1 | O3-mini |
|------------------------|--------|------------------------|-------------|----------------|---------|---------|
| Advanced Knowledge Q&A | Acc    | **84.00**              | 77.00       | 64.00          | 76.00   | 80.00   |
| Name Conversion        | Acc    | **71.00**              | 6.00        | 15.00          | 4.00    | 6.00    |
| Property Prediction    | Acc    | **85.33**              | 41.73       | 51.19          | 51.66   | 67.58   |

Evaluation Notes:

  1. All results are zero-shot performance averages
  2. A consistent evaluation protocol was applied across all models; a minimal accuracy-scoring sketch follows these notes
  3. DeepSeek-R1, Gemini 2.5 Pro, GPT-4.1, and O3-mini were evaluated using Chain-of-Thought (CoT) reasoning with API verification
  4. Spark Chemistry-X1-13B was evaluated using Chain-of-Thought (CoT) reasoning in a local environment on NVIDIA A800 80GB GPUs
  5. The evaluation dataset was self-constructed
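
Accuracy in the table above is the percentage of correctly answered questions. As a purely illustrative sketch (not the card's actual scorer), an exact-match accuracy over hypothetical predictions and reference answers could look like this:

# A minimal exact-match scorer (one possible criterion); `predictions` and
# `golds` are hypothetical lists of model answers and reference answers.
def exact_match_accuracy(predictions, golds):
    correct = sum(
        p.strip().lower() == g.strip().lower()
        for p, g in zip(predictions, golds)
    )
    return 100.0 * correct / len(golds)

# Example: two of three answers match the references -> ~66.67
print(exact_match_accuracy(
    ["acetic acid", "benzene", "2-propanol"],
    ["acetic acid", "toluene", "2-propanol"],
))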

Usage

Requirements

cd /path/to/Spark-Chemistry-X1-13B
# We recommend using Python 3.10
pip install -r requirements.txt
pip install .

Quickstart

from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)
# Reactive (fast thinking) mode: no special system prompt is required.
# The example question (in Chinese) asks: "Whether a polymer material is flexible
# depends mainly on the mobility of ( ). A. main-chain segments B. side groups
# C. functional groups or atoms within the side groups"
chat_history = [
  {
    "role": "user",
    "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
  }]

inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)

response = tokenizer.decode(
    outputs[0][inputs.shape[1] :],
    skip_special_tokens=True
)
print(response)

from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "iflytek/Spark-Chemistry-X1-13B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32,
    device_map="auto",
    trust_remote_code=True
)
# Deliberative (slow thinking) mode: prepend a system prompt that instructs
# the model to first analyze the question and generate a reasoning trace,
# wrapped between <unused6> and <unused7>, with the final answer after <unused7>.
chat_history = [
  {
    "role": "system",
    "content": "请你先深入剖析给出问题的关键要点与内在逻辑,生成思考过程,再根据思考过程回答给出问题。思考过程以<unused6>开头,在结尾处用<unused7>标注结束,<unused7>后为基于思考过程的回答内容"
  },
  {
    "role": "user",
    "content": "请回答下列问题:高分子材料是否具有柔顺性主要决定于()的运动能力。\nA、主链链节\nB、侧基\nC、侧基内的官能团或原子?"
  }]


inputs = tokenizer.apply_chat_template(
    chat_history,
    tokenize=True,
    return_tensors="pt",
    add_generation_prompt=True
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=8192,
    top_k=1,
    do_sample=True,
    repetition_penalty=1.02,
    temperature=0.7,
    eos_token_id=5,
    pad_token_id=0,
)

response = tokenizer.decode(
    outputs[0][inputs.shape[1] :],
    skip_special_tokens=True
)
print(response)
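
In deliberative mode, the reasoning trace and the final answer come back as a single string, with the answer following the <unused7> marker described in the system prompt above. A minimal sketch for separating the two parts, reusing outputs, inputs, and tokenizer from the snippet above; it decodes with skip_special_tokens=False on the assumption that the markers would otherwise be stripped:

# Keep special tokens so the <unused6>/<unused7> markers survive decoding.
raw = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=False)

THINK_START, THINK_END = "<unused6>", "<unused7>"
if THINK_END in raw:
    thinking, answer = raw.split(THINK_END, 1)
    thinking = thinking.replace(THINK_START, "").strip()
else:
    # Fallback: no marker found, treat the whole output as the answer.
    thinking, answer = "", raw

print("Reasoning trace:\n", thinking)
print("Final answer:\n", answer.strip())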

Optional: Convert FP32 Weights to BF16

The released weights of Spark Chemistry-X1-13B are stored in FP32 precision. For inference efficiency, users can optionally convert the weights into bfloat16 (BF16) format.

from modelscope import AutoModelForCausalLM
import torch

model_name = "/path_to/Spark-Chemistry-X1-13B"

# Load FP32 weights
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float32, # explicitly FP32
    device_map="auto",
    trust_remote_code=True
)

# Convert to BF16
model = model.to(torch.bfloat16)

# Save BF16 weights for later fast loading
save_path = "./Spark-Chemistry-X1-13B-bf16"
model.save_pretrained(save_path, safe_serialization=True)
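
The converted directory can then be loaded directly in BF16. A minimal sketch; it assumes the custom modeling files required by trust_remote_code are available alongside the saved weights, and it reuses model_name from above for the tokenizer, which is not saved by the conversion snippet:

from modelscope import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the converted BF16 weights for faster inference.
model = AutoModelForCausalLM.from_pretrained(
    "./Spark-Chemistry-X1-13B-bf16",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# The tokenizer is loaded from the original model directory.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)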

License Agreement

iFLYTEK Spark Chemistry-X1-13B is licensed under Apache 2.0.
