You can caption images with a request like this (ms-swift format):

{"messages": [{"role": "user", "content": "<image>Write a detailed caption for this image, maximum 1000 words. Mention pivotal elements—people, objects, scenery—using confident, definite language. Focus on concrete details like color, shape, and spatial relationships. Use vulgar descriptions of what the characters are doing."}], "images": ["josman_images/97fecadaa022a4bf7a4947edc1f3260a.webp"]}

About the model: The model was fine-tuned on synthetic reasoning data with 500 curated, high-quality captions, where the reasoning includes contrastive examples of the model 'fixing its own mistakes'. It was also trained on 9,000 image-tag pairs with synthetic chain of thought. The chain of thought was generated by Gemini 3.1 Flash (for tags) and Pro (for captions). All tags and captions were very high quality.

What is in this repo: Qwen3.5 9B fused with a rank-32 LoRA and a fully fine-tuned projector (visual.merger). This is the second fusion: training started from my captioning v1 base model, which already had a rank-32 LoRA fused in.

Safetensors · Model size: 9B params · Tensor type: BF16

Model tree for oldhag88/qwen3.5-9b-nsfw-captioning-v2: finetuned from Qwen/Qwen3.5-9B.