OWLv2 Inference Endpoint

Custom handler for OWLv2 (Open-World Localization v2) supporting both image-conditioned and text-conditioned object detection.

Features

  • Image-conditioned detection: Find objects similar to a reference image
  • Text-conditioned detection: Find objects matching text descriptions
  • Multiple query images: Search for several different objects at once

Usage

Image-Conditioned Detection

Find all instances of an icon/object in a target image:

import requests
import base64

API_URL = "https://your-endpoint.endpoints.huggingface.cloud"
headers = {"Authorization": "Bearer YOUR_TOKEN"}

# Load images as base64
with open("screenshot.png", "rb") as f:
    target_b64 = base64.b64encode(f.read()).decode()
with open("icon.png", "rb") as f:
    query_b64 = base64.b64encode(f.read()).decode()

response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "target_image": target_b64,
        "query_image": query_b64,
        "threshold": 0.5
    }
})

print(response.json())
# {"detections": [{"box": [100, 200, 150, 250], "confidence": 0.92}]}

Text-Conditioned Detection

Find objects by description:

response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "target_image": target_b64,
        "queries": ["a play button", "a settings icon"],
        "threshold": 0.1
    }
})

Multiple Query Images

Find several different objects:

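# icon1_b64, icon2_b64, icon3_b64 are base64 strings, encoded the same way as query_b64 above
response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "target_image": target_b64,
        "query_images": [icon1_b64, icon2_b64, icon3_b64],
        "threshold": 0.5
    }
})
# Each detection includes a "label" naming its query image: "query_0", "query_1", etc.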
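
When several query images are sent, it can help to map the labels back to readable names. The following is a minimal sketch, assuming that the number in "query_N" is the position of the image in the query_images list (the text above implies this but does not spell it out). The function name group_by_query and the query_names argument are hypothetical helpers, not part of the endpoint.

```python
from collections import defaultdict


def group_by_query(detections, query_names):
    """Group detections by query, replacing "query_N" with a readable name.

    Assumption: "query_N" refers to the N-th entry of the query_images list.
    Labels that do not match this pattern are kept unchanged.
    """
    grouped = defaultdict(list)
    for det in detections:
        label = det.get("label", "")
        prefix, _, index = label.partition("_")
        if prefix == "query" and index.isdigit() and int(index) < len(query_names):
            key = query_names[int(index)]
        else:
            key = label or "unlabeled"
        grouped[key].append(det)
    return dict(grouped)
```

For the request above, this could be called as group_by_query(response.json()["detections"], ["icon1", "icon2", "icon3"]).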

Parameters

Parameter      Type    Default   Description
target_image   string  required  Base64-encoded target image
query_image    string  -         Base64-encoded reference image
query_images   array   -         Multiple base64-encoded reference images
queries        array   -         Text descriptions to search for
threshold      float   0.5       Confidence threshold (0-1)
nms_threshold  float   0.3       Non-max suppression threshold
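
The parameters above can be assembled into a request body with a small helper. This is a sketch rather than the handler's own validation: build_payload is a hypothetical name, and it assumes that nms_threshold sits inside "inputs" next to threshold, as the usage examples do for threshold. It does not forbid combining query types, since the README does not say whether the handler accepts that.

```python
from typing import Optional, Sequence


def build_payload(
    target_image: str,
    *,
    query_image: Optional[str] = None,
    query_images: Optional[Sequence[str]] = None,
    queries: Optional[Sequence[str]] = None,
    threshold: float = 0.5,
    nms_threshold: float = 0.3,
) -> dict:
    """Build the JSON body for one detection request.

    Image arguments are base64 strings; defaults mirror the parameter table.
    """
    if query_image is None and not query_images and not queries:
        raise ValueError("provide query_image, query_images, or queries")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")

    inputs = {
        "target_image": target_image,
        "threshold": threshold,
        # Assumption: nms_threshold is passed alongside threshold.
        "nms_threshold": nms_threshold,
    }
    if query_image is not None:
        inputs["query_image"] = query_image
    if query_images:
        inputs["query_images"] = list(query_images)
    if queries:
        inputs["queries"] = list(queries)
    return {"inputs": inputs}
```

The result can be passed directly as the json argument, for example requests.post(API_URL, headers=headers, json=build_payload(target_b64, queries=["a play button"], threshold=0.1)).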

Response Format

{
  "detections": [
    {
      "box": [x1, y1, x2, y2],
      "confidence": 0.95,
      "label": "query_0"
    }
  ]
}
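
A response in this shape can be post-processed on the client. The sketch below filters detections by confidence and derives each box's size and center. It assumes that box holds corner coordinates [x1, y1, x2, y2] with x2 >= x1 and y2 >= y1, and it treats "label" as optional because the single-query example output above omits it. The name summarize_detections is hypothetical.

```python
from typing import Any


def summarize_detections(payload: dict, min_confidence: float = 0.0) -> list:
    """Return detections at or above min_confidence, highest confidence first."""
    rows: list[dict[str, Any]] = []
    for det in payload.get("detections", []):
        if det["confidence"] < min_confidence:
            continue
        x1, y1, x2, y2 = det["box"]
        rows.append({
            "label": det.get("label"),  # may be absent for a single query
            "confidence": det["confidence"],
            "width": x2 - x1,
            "height": y2 - y1,
            "center": ((x1 + x2) / 2, (y1 + y2) / 2),
        })
    rows.sort(key=lambda row: row["confidence"], reverse=True)
    return rows
```

Typical use would be summarize_detections(response.json(), min_confidence=0.6) to apply a stricter cutoff than the one sent in the request.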

Model

Uses google/owlv2-large-patch14-ensemble for best accuracy.
