# NaNovel-35B-A3B
NaNovel-35B-A3B is the sparse Mixture-of-Experts entry in the Novelist series. It targets users who want a larger writing model with MoE-style capacity while keeping per-token activation lower than a fully dense model of similar total size.
## Novelist Series
- Base models: Qwen3.5-9B, Qwen3.5-27B, Qwen3.5-35B-A3B
- Autoregressive models: NaNovel-9B, NaNovel-27B, NaNovel-35B-A3B
- Diffusion models coming soon.
## Model Overview
NaNovel-35B-A3B was fine-tuned on Dxniz/Novelist-CoT with the repository's dedicated MoE training pipeline. The goal is to combine creative-writing specialization with the routing behavior of the base Qwen3.5-35B-A3B architecture. This model is aimed at users exploring the upper end of the Novelist line who specifically want the characteristics of the MoE base.
The local training code describes the base as a 35B-total model with roughly 3B active parameters per token. It is trained for long-context fiction work, narrative reasoning, and literary instruction following.
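As a rough illustration of what the sparse activation means in practice (using only the 35B total / ~3B active figures stated above; exact counts depend on the released config), the per-token activation fraction works out to well under a tenth of the total parameter count:

```python
# Back-of-the-envelope comparison, assuming the stated 35B total
# and ~3B active parameters per token.
total_params = 35e9
active_params = 3e9

active_fraction = active_params / total_params
print(f"active fraction per token: {active_fraction:.1%}")  # ~8.6%
```

This is why an MoE model of this total size can have per-token compute closer to a small dense model while retaining larger total capacity.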
## Evaluation
This model was evaluated with the Dxniz/Novelist-Bench benchmark dataset.
The repository evaluation summaries show the following results for NaNovel-35B-A3B:
Overall and detailed evaluation result tables are published in the repository's evaluation summaries.
In the current repository measurements, this model improves when explicit thinking is enabled and shows its strongest relative behavior in grammar editing, worldbuilding, style control, and character-driven prose. The 27B dense model still scores higher overall in these summaries, so this card should be read as an MoE-focused option rather than the default best choice for every user.
## Recommended Use
- Long-context scene and chapter generation
- Users specifically interested in the Qwen3.5 MoE base
- Prose generation where style and atmosphere matter more than benchmark-leading balance
- Creative experiments comparing dense and MoE Novelist behavior
## Limitations
- Current local evaluation summaries place it below NaNovel-27B overall
- Hardware and software expectations are more demanding than for the 9B and 27B dense variants
- MoE deployment details vary by stack; inference setup should be validated in your environment
- Generated text should be reviewed before publication or downstream automation
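As a minimal sketch of the "validate your inference setup" point, a deployment script can sanity-check the MoE routing fields before loading weights. The field names `num_experts` and `num_experts_per_tok` follow the Qwen MoE convention and are an assumption here; verify them against the released `config.json`:

```python
# Minimal pre-deployment sanity check for an MoE config dict
# (e.g. the parsed contents of config.json).
# Field names are assumed from the Qwen MoE convention.
def check_moe_config(config: dict) -> list[str]:
    problems = []
    experts = config.get("num_experts")
    active = config.get("num_experts_per_tok")
    if experts is None or active is None:
        problems.append("config does not expose MoE routing fields")
    elif active > experts:
        problems.append(f"active experts ({active}) exceed total experts ({experts})")
    return problems

# Example with a hypothetical config:
print(check_moe_config({"num_experts": 128, "num_experts_per_tok": 8}))  # []
```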
## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Dxniz/NaNovel-35B-A3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Novelist, a creative writing assistant."},
    {"role": "user", "content": "Write a baroque dark fantasy scene set inside a flooded cathedral archive."},
]

# Build the prompt tensor using the model's chat template.
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=1600,
    temperature=0.7,
    top_p=0.85,
    top_k=20,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
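For hardware planning, a rough lower bound on weight memory follows from the total parameter count (bf16 stores 2 bytes per parameter; activations, KV cache, and runtime overhead add more on top). The 35B figure is taken from the model name and is an approximation:

```python
# Back-of-the-envelope weight-memory estimate for bf16 loading.
# Actual usage also includes activations, KV cache, and framework overhead.
params = 35e9
bytes_per_param = 2  # bfloat16

weight_gib = params * bytes_per_param / 1024**3
print(f"approximate weight memory: {weight_gib:.0f} GiB")  # ~65 GiB
```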
## License
Apache 2.0, consistent with the base model license.

