# Visual Grounding Adapter

A LoRA adapter fine-tuned from Qwen2-VL-7B-Instruct for visual grounding tasks (localizing image regions referred to by text).

## Usage

```python
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import PeftModel
import torch

# Load the base model
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-7B-Instruct",
    device_map="auto",
    torch_dtype=torch.float16,
)

# Load the adapter on top of the base model
model = PeftModel.from_pretrained(model, "ishanmakkar/visual-grounding-adapter")
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
```
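After generation, Qwen2-VL-style models conventionally wrap grounding results in special tokens, with coordinates normalized to a 0–1000 grid. The parser below is a minimal sketch under that assumption (the exact token format of this adapter's output may differ; verify against real generations):

```python
import re

def parse_boxes(text, width, height):
    """Extract bounding boxes from Qwen2-VL-style grounding output.

    Assumes boxes appear as <|box_start|>(x1,y1),(x2,y2)<|box_end|>
    with coordinates normalized to 0-1000 (the base model's convention),
    and rescales them to pixel space for an image of the given size.
    """
    pattern = r"<\|box_start\|>\((\d+),(\d+)\),\((\d+),(\d+)\)<\|box_end\|>"
    boxes = []
    for x1, y1, x2, y2 in re.findall(pattern, text):
        boxes.append((
            int(x1) * width / 1000, int(y1) * height / 1000,
            int(x2) * width / 1000, int(y2) * height / 1000,
        ))
    return boxes

# Hypothetical model output for illustration
sample = ("<|object_ref_start|>the resistor<|object_ref_end|>"
          "<|box_start|>(100,200),(300,400)<|box_end|>")
print(parse_boxes(sample, width=1000, height=1000))
```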

## Training

- Dataset: custom diagrams annotated with bounding boxes
- LoRA rank: 8-16
- Epochs: 2-3
- Hardware: Google Colab (T4 GPU)
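For context on the "LoRA rank" setting above: a rank-`r` LoRA adapter stores two small matrices `B` (d_out × r) and `A` (r × d_in), and the effective weight update is `ΔW = (alpha / r) · B @ A`. The toy sketch below (dimensions and values are illustrative, not taken from this adapter) shows the arithmetic in pure Python:

```python
def matmul(X, Y):
    """Naive matrix multiply for small illustrative matrices."""
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_delta(B, A, alpha, r):
    """Compute the LoRA weight update: (alpha / r) * B @ A."""
    scale = alpha / r
    return [[scale * x for x in row] for row in matmul(B, A)]

# Toy example: d_out=2, d_in=3, rank r=1 (real adapters here use r=8-16)
B = [[1.0], [2.0]]           # d_out x r
A = [[0.5, 0.0, -0.5]]       # r x d_in
delta = lora_delta(B, A, alpha=16, r=1)
```

The rank controls the trade-off: higher `r` means more trainable parameters and a more expressive update, at the cost of memory, which matters on a T4.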