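🌟 Project Overview

QwenoV3 combines the DINOv3 vision encoder with the Qwen3 language model to build a lightweight yet capable multimodal model. It understands both images and text, and can hold a fluent conversation about the content of an image.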

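✨ Key Features

  • Strong visual understanding: uses DINOv3-ViT-L as the vision backbone to extract rich semantic features from images.
  • Efficient language generation: built on the Qwen3-0.6B language model, with good dialogue and instruction-following ability.
  • Lightweight design: about 1B parameters in total, which makes it easy to deploy and to use for research.
  • Streamlit web interface: an interactive web UI built on Streamlit that supports model switching, parameter adjustment, and image upload.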
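
To give a feel for what "extracting features from an image" means for a ViT backbone, the sketch below counts the patch tokens produced for a given input size. It assumes a patch size of 16 (as in the public DINOv3 ViT-L/16 checkpoints); the helper name `vit_patch_token_count` is hypothetical, and how many of these tokens QwenoV3 actually passes to the language model is not stated in this card.

```python
def vit_patch_token_count(height: int, width: int, patch_size: int = 16) -> int:
    """Number of non-overlapping patches a ViT cuts an image into.

    patch_size=16 is an assumption based on the ViT-L/16 naming; the
    preprocessing resolution used by QwenoV3 is not given in this card.
    """
    if height <= 0 or width <= 0 or patch_size <= 0:
        raise ValueError("height, width and patch_size must be positive")
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    # One token per patch: rows of patches times columns of patches.
    return (height // patch_size) * (width // patch_size)


if __name__ == "__main__":
    for size in (224, 448):
        print(size, "x", size, "->", vit_patch_token_count(size, size), "patch tokens")
```

The count grows quadratically with the image side, which is one reason input resolution matters for the sequence length seen by a small language model.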

Inference Usage

from transformers import AutoModelForCausalLM, AutoConfig
from transformers.image_utils import load_image
from Qwenov3Config import Qwenov3Config, Qwenov3, Qwenov3Processor
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_path = 'TianYeZ1214/Qwenov3'
# Register the custom config and model classes so the Auto* classes can resolve "Qwenov3".
AutoConfig.register("Qwenov3", Qwenov3Config)
AutoModelForCausalLM.register(Qwenov3Config, Qwenov3)

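# Note: flash_attention_2 needs the flash-attn package and a CUDA GPU;
# remove that argument to fall back to the default attention implementation.
model = AutoModelForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, dtype=torch.bfloat16,
                                             trust_remote_code=True, attn_implementation="flash_attention_2").to(device)
# The image processor and tokenizer are exposed as attributes of the loaded model.
processor = Qwenov3Processor(image_processor=model.processor, tokenizer=model.tokenizer)
model.eval()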

messages = [
    {"role": "system", "content": 'You are a helpful assistant.'},
    {"role": "user", "content": "描述图片内容"},  # "Describe the content of the image"
]

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = load_image(url)

q_text = processor.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)

inputs = processor(
    text=[q_text],
    images=image,
    padding=True,
    return_tensors="pt",
).to(device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_k=20,
    top_p=0.8,
    do_sample=True,
    repetition_penalty=1.1,
)

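output_ids = output_ids[0].tolist()

# 151668 is the id of Qwen3's </think> token: find its last occurrence and
# keep only what follows it; if it is absent, keep the whole sequence.
try:
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0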

content = processor.decode(output_ids[index:], skip_special_tokens=True)
print("content:", content)
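
The slicing step at the end of the script can be isolated into a small helper, which makes it easier to test without loading the model. This is a minimal sketch: `split_thinking` is a hypothetical name, and the default id 151668 is taken from the script above, where it is used as the `</think>` marker.

```python
from typing import List, Tuple

THINK_END_ID = 151668  # id used in the script above for Qwen3's </think> token


def split_thinking(token_ids: List[int],
                   think_end_id: int = THINK_END_ID) -> Tuple[List[int], List[int]]:
    """Split generated ids at the LAST occurrence of the end-of-thinking id.

    Returns (thinking_part, answer_part). The marker itself stays in the
    thinking part. If the marker is absent, everything is the answer.
    """
    try:
        # Searching the reversed list yields the distance from the end.
        cut = len(token_ids) - token_ids[::-1].index(think_end_id)
    except ValueError:
        cut = 0
    return token_ids[:cut], token_ids[cut:]


if __name__ == "__main__":
    thinking, answer = split_thinking([11, 12, THINK_END_ID, 21, 22])
    print("thinking ids:", thinking)
    print("answer ids:", answer)
```

With a helper like this, the final decode call would receive only the answer part, for example `processor.decode(answer, skip_special_tokens=True)`.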