---
license: other
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
  - custom
library_name: transformers
tags:
  - sft
  - instruction-tuning
  - qwen
inference: false
---

# qwen3_4b_instruct_2507_sft_v1

This repository contains a supervised fine-tuning (SFT) checkpoint of Qwen3-4B-Instruct-2507 trained with DeepSpeed ZeRO-3. The ZeRO-3 partitioned weights were consolidated and exported to the Hugging Face safetensors format for easier deployment.
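As a rough sketch of that consolidation step (not this repository's exact export script; the checkpoint paths are placeholders):

```python
# Sketch: fold ZeRO-3 partitioned shards back into a full model, then
# export as safetensors. Paths are placeholders, not the original run's.
from transformers import AutoModelForCausalLM
from deepspeed.utils.zero_to_fp32 import load_state_dict_from_zero_checkpoint

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
model = load_state_dict_from_zero_checkpoint(model, "/path/to/zero3_run")
model.save_pretrained("/path/to/export", safe_serialization=True)
```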

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Chouoftears/qwen3_4b_instruct_2507_sft_v1"

# Load the tokenizer and the consolidated safetensors weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",  # keep the stored bf16 dtype instead of upcasting
    trust_remote_code=True,
)
```
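For a quick smoke test, the bundled chat template can be applied before generation; the prompt and `max_new_tokens` below are illustrative:

```python
messages = [{"role": "user", "content": "Summarize supervised fine-tuning in one sentence."}]

# apply_chat_template renders the conversation with chat_template.jinja.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```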

## Training

- Base model: `Qwen/Qwen3-4B-Instruct-2507`
- Framework: `transformers==4.56.2`
- Optimization: DeepSpeed ZeRO Stage-3, bf16
- SFT run name: `qwen3-4B-Instruct-2507-toucan-sft-3ep`
- Max sequence length: 262,144 tokens (per config)

Refer to `training_args.bin` in the original run directory for the full trainer configuration.
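For instance, the pickled arguments can be inspected directly (assuming the file is available locally; the fields printed are just examples):

```python
import torch

# training_args.bin is a pickled transformers.TrainingArguments object,
# so it must be loaded with weights_only=False and a matching transformers install.
args = torch.load("training_args.bin", weights_only=False)
print(args.num_train_epochs, args.learning_rate, args.per_device_train_batch_size)
```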

## Files

- `model-0000X-of-00004.safetensors`: model weight shards
- `model.safetensors.index.json`: weight index map (see the sketch after this list)
- `config.json` / `generation_config.json`: architecture and generation defaults
- Tokenizer artifacts: `tokenizer.json`, `tokenizer_config.json`, `vocab.json`, `merges.txt`, `special_tokens_map.json`, `added_tokens.json`
- `chat_template.jinja`: conversation formatting used during SFT
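The index map can be read to find which shard stores a given tensor; the tensor name below is illustrative:

```python
import json

with open("model.safetensors.index.json") as f:
    index = json.load(f)

# weight_map maps each tensor name to the shard file that contains it.
print(index["weight_map"]["model.embed_tokens.weight"])
```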

## Limitations

This checkpoint inherits the limitations of the base Qwen3 model and of the SFT data. Review it against your downstream safety and compliance requirements before deployment.