---
library_name: peft
base_model: Qwen/Qwen2-VL-7B-Instruct
tags:
- visual-grounding
- qwen2-vl
- multimodal
---

# Visual Grounding Adapter

A LoRA adapter for Qwen2-VL-7B-Instruct, fine-tuned for visual grounding on diagrams (localizing referenced elements as bounding boxes).

## Usage

```python
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import PeftModel
import torch

# Load base model
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    device_map="auto",
    torch_dtype=torch.float16
)

# Load adapter and processor
model = PeftModel.from_pretrained(model, "YOUR_USERNAME/visual-grounding-adapter")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
```

Generation then follows the standard Qwen2-VL chat flow; see the inference example at the end of this card.

## Training

- Dataset: custom diagrams with bounding-box annotations
- LoRA rank: 8-16
- Epochs: 2-3
- Hardware: Google Colab (T4 GPU)
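The full training script is not included in this card. The sketch below shows a representative `peft` LoRA setup consistent with the rank range listed above; the target modules, `lora_alpha`, and `lora_dropout` values are assumptions, not the exact settings used.

```python
from peft import LoraConfig, get_peft_model

# Representative configuration; only the rank range (8-16) is documented above,
# the remaining hyperparameters are illustrative assumptions.
lora_config = LoraConfig(
    r=8,                           # rank within the 8-16 range used for this adapter
    lora_alpha=16,                 # assumed scaling factor
    lora_dropout=0.05,             # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

# `base_model` is the Qwen2-VL base model loaded as in the Usage snippet,
# *before* any adapter is attached.
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()
```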
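## Inference example

A minimal generation sketch continuing from the Usage snippet above (`model` and `processor` already loaded). It follows the standard Qwen2-VL chat flow and requires the `qwen-vl-utils` package; the image path and prompt are placeholders, and the format of the returned coordinates depends on how the training annotations were encoded.

```python
from qwen_vl_utils import process_vision_info  # pip install qwen-vl-utils

# Placeholder image and query -- substitute your own diagram and prompt
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/diagram.png"},
            {"type": "text", "text": "Locate the power switch in this diagram."},
        ],
    }
]

# Build the chat prompt and preprocess the image
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)

# Generate and strip the prompt tokens from the output
generated_ids = model.generate(**inputs, max_new_tokens=128)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```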