---
library_name: peft
base_model: Qwen/Qwen2-VL-7B-Instruct
tags:
- visual-grounding
- qwen2-vl
- multimodal
---

# Visual Grounding Adapter

A LoRA adapter for Qwen2-VL-7B-Instruct, fine-tuned for visual grounding on diagrams (localizing referenced elements as bounding boxes).

## Usage

```python
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import PeftModel
import torch

# Load base model
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    device_map="auto",
    torch_dtype=torch.float16
)

# Load adapter and processor
model = PeftModel.from_pretrained(model, "YOUR_USERNAME/visual-grounding-adapter")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
```

Generation then follows the standard Qwen2-VL chat flow; see the inference example at the end of this card.

## Training

- Dataset: custom diagrams with bounding-box annotations
- LoRA rank: 8-16
- Epochs: 2-3
- Hardware: Google Colab (T4 GPU)
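The full training script is not included in this card. The sketch below shows a representative `peft` LoRA setup consistent with the rank range listed above; the target modules, `lora_alpha`, and `lora_dropout` values are assumptions, not the exact settings used.

```python
from peft import LoraConfig, get_peft_model

# Representative configuration; only the rank range (8-16) is documented above,
# the remaining hyperparameters are illustrative assumptions.
lora_config = LoraConfig(
    r=8,                           # rank within the 8-16 range used for this adapter
    lora_alpha=16,                 # assumed scaling factor
    lora_dropout=0.05,             # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

# `base_model` is the Qwen2-VL base model loaded as in the Usage snippet,
# *before* any adapter is attached.
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()
```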
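## Inference example

A minimal generation sketch continuing from the Usage snippet above (`model` and `processor` already loaded). It follows the standard Qwen2-VL chat flow and requires the `qwen-vl-utils` package; the image path and prompt are placeholders, and the format of the returned coordinates depends on how the training annotations were encoded.

```python
from qwen_vl_utils import process_vision_info  # pip install qwen-vl-utils

# Placeholder image and query -- substitute your own diagram and prompt
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/diagram.png"},
            {"type": "text", "text": "Locate the power switch in this diagram."},
        ],
    }
]

# Build the chat prompt and preprocess the image
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)

# Generate and strip the prompt tokens from the output
generated_ids = model.generate(**inputs, max_new_tokens=128)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```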