NaNovel-35B-A3B

NaNovel-35B-A3B is the sparse Mixture-of-Experts entry in the Novelist series. It targets users who want a larger writing model with MoE-style capacity while keeping per-token activation lower than a fully dense model of similar total size.

Model Overview

NaNovel-35B-A3B was fine-tuned on Dxniz/Novelist-CoT with the repository's dedicated MoE training pipeline. The goal is to combine creative-writing specialization with the routing behavior of the base Qwen3.5-35B-A3B architecture. This model is aimed at users exploring the upper end of the Novelist line who specifically want the characteristics of the MoE base.

The local training code describes the base as a 35B-total model with roughly 3B active parameters per token. It is trained for long-context fiction work, narrative reasoning, and literary instruction following.
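The split between total and active parameters comes from top-k expert routing: each token is sent to only a few experts, so per-token compute tracks the ~3B active parameters rather than the 35B total. The sketch below illustrates the routing idea only; the expert count, hidden size, and k value are toy assumptions, not the actual Qwen3.5-35B-A3B configuration, and a real router also handles load balancing and expert capacity.

```python
import numpy as np

def route_tokens(hidden, router_weights, k=2):
    """Pick the top-k experts per token; return expert indices and softmax gate weights."""
    logits = hidden @ router_weights                    # (tokens, num_experts)
    topk_idx = np.argsort(logits, axis=-1)[:, -k:]      # k highest-scoring experts per token
    topk_logits = np.take_along_axis(logits, topk_idx, axis=-1)
    gates = np.exp(topk_logits - topk_logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)          # normalize over the chosen experts only
    return topk_idx, gates

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 16))    # 4 tokens, hidden size 16 (toy values)
router = rng.normal(size=(16, 8))    # 8 experts (toy value)
idx, gates = route_tokens(hidden, router)
print(idx.shape, gates.shape)        # each token gets k=2 experts and k gate weights
```

Only the selected experts run their feed-forward pass for a given token, which is the mechanism behind the "35B total, ~3B active" description above.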

Evaluation

This model was evaluated with the Dxniz/Novelist-Bench benchmark dataset.

The repository evaluation summaries show the following results for NaNovel-35B-A3B:

Overall evaluation results (figure not reproduced here).

Detailed evaluation results (figure not reproduced here).

In the current repository measurements, this model improves when explicit thinking is enabled and shows its strongest relative behavior in grammar editing, worldbuilding, style control, and character-driven prose. The 27B dense model still scores higher overall in these summaries, so this card should be read as an MoE-focused option rather than the default best choice for every user.

Recommended Use

  • Long-context scene and chapter generation
  • Users specifically interested in the Qwen3.5 MoE base
  • Prose generation where style and atmosphere matter more than benchmark-leading balance
  • Creative experiments comparing dense and MoE Novelist behavior

Limitations

  • Current local evaluation summaries place it below NaNovel-27B overall
  • Hardware and software expectations are more demanding than the 9B and 27B dense variants
  • MoE deployment details vary by stack; inference setup should be validated in your environment
  • Generated text should be reviewed before publication or downstream automation
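The hardware point above can be made concrete with a back-of-the-envelope estimate: MoE inference still needs every expert resident in memory, so weight memory scales with the 35B total parameters even though per-token compute scales with the ~3B active parameters. The figures below assume BF16 weights (2 bytes per parameter) and ignore KV cache and activation overhead, so treat the result as a lower bound.

```python
TOTAL_PARAMS = 35e9     # total parameters across all experts (must all be loaded)
ACTIVE_PARAMS = 3e9     # parameters active per token (drives compute, not memory)

BYTES_PER_PARAM = 2     # BF16
weights_gib = TOTAL_PARAMS * BYTES_PER_PARAM / 1024**3
print(f"BF16 weights: ~{weights_gib:.0f} GiB")  # runtime overhead comes on top
```

This is why the 9B and 27B dense variants are noted as less demanding: the dense models' weight footprint matches their compute footprint, while the MoE model pays the full 35B in memory for 3B of per-token compute.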

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Dxniz/NaNovel-35B-A3B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Novelist, a creative writing assistant."},
    {"role": "user", "content": "Write a baroque dark fantasy scene set inside a flooded cathedral archive."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=1600,
    temperature=0.7,
    top_p=0.85,
    top_k=20,
    do_sample=True,
)

print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

License

Apache 2.0, consistent with the base model license.
