Qwen3.5-27B-Heretic-Marvin-V1

A creative-writing style fine-tune of Qwen3.5-27B, built on an uncensored base and trained with anti-repetition DPO.

Model Stack

  1. Base: Qwen/Qwen3.5-27B
  2. Uncensored: llmfan46/Qwen3.5-27B-heretic-v2 (refusal suppression)
  3. Anti-repetition: DPO training (611 preference pairs, 1 epoch) to reduce degenerate repetition loops
  4. Style SFT: Marvin style bible fine-tune (4478 samples, 17.2M tokens, 1 epoch) with 1.5x LoRA scaling for stronger style influence

Training Details

Anti-Repetition DPO

  • QLoRA r=32, alpha=16, RSLoRA
  • 611 chosen/rejected pairs, 1 epoch
  • Targets catastrophic repetition loops while preserving base-model capabilities
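The DPO stage trains on chosen/rejected pairs (non-repetitive vs. repetitive completions). Below is a minimal sketch of the per-pair DPO objective; the beta value and the log-probability inputs are illustrative, not taken from this card's training run.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed token log-probability of a full response
    under the policy (or frozen reference) model.
    """
    # Implicit reward margin: how much more the policy prefers `chosen`
    # over `rejected`, relative to the reference model.
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(beta * margin)): small when the policy already prefers chosen.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Policy prefers the chosen (non-repetitive) completion -> low loss.
low = dpo_loss(-10.0, -40.0, ref_logp_chosen=-20.0, ref_logp_rejected=-30.0)
# Policy prefers the rejected (repetitive) completion -> high loss.
high = dpo_loss(-40.0, -10.0, ref_logp_chosen=-30.0, ref_logp_rejected=-20.0)
assert low < high
```

Minimizing this loss pushes probability mass away from the repetitive completions without an explicit reward model, which is why a small pair set (611 here) can suffice.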

Style SFT

  • QLoRA r=32, alpha=16, RSLoRA
  • 4478 samples, ~17.2M tokens, 1 epoch, 6144 context
  • LoRA merged at 1.5x scaling (alpha effectively 24) for stronger style transfer
  • Hardware: 2× RTX 3090
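The 1.5x merge scaling amounts to multiplying the LoRA delta before adding it to the base weights. A toy sketch with made-up dimensions, assuming the RSLoRA scaling factor alpha/sqrt(r); either way, scaling the delta by 1.5 is algebraically the same as merging with alpha = 24:

```python
import numpy as np

rng = np.random.default_rng(0)

r, alpha, scale = 32, 16, 1.5   # rank/alpha from this card; 1.5x merge scaling
d_out, d_in = 8, 8              # toy dimensions for illustration

W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # LoRA "down" projection
B = rng.standard_normal((d_out, r))      # LoRA "up" projection

# RSLoRA merge: W' = W + (alpha / sqrt(r)) * B @ A
delta = (alpha / np.sqrt(r)) * (B @ A)
merged_scaled = W + scale * delta
# Equivalent merge with alpha = 24 and no extra scaling.
merged_alpha24 = W + (24 / np.sqrt(r)) * (B @ A)

assert np.allclose(merged_scaled, merged_alpha24)
```

Over-scaling a merge trades some general capability for stronger style transfer, which is presumably why the card notes the effective alpha explicitly.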

Key Properties

  • Uncensored/unrefused (from heretic-v2 base)
  • Anti-repetition (from DPO stage)
  • Distinctive creative writing voice: wry, observational, sensory-rich prose
  • Technical capability preserved
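To prompt the model, Qwen chat models use a ChatML-style template. The sketch below builds that format by hand so the token layout is visible; in practice you would use `tokenizer.apply_chat_template` from transformers instead. The system message is a hypothetical example, not a prompt shipped with this model.

```python
def build_chatml_prompt(messages):
    """Build a ChatML-style prompt string (the format Qwen chat models use).

    `messages` is a list of {"role", "content"} dicts; the final assistant
    header is left open so the model continues from there.
    """
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a wry, observational narrator."},
    {"role": "user", "content": "Describe a rainy bus stop in two sentences."},
])
```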