OWLv2 Inference Endpoint
Custom handler for OWLv2 (Open-World Localization v2) supporting both image-conditioned and text-conditioned object detection.
Features
- Image-conditioned detection: Find objects similar to a reference image
- Text-conditioned detection: Find objects matching text descriptions
- Multiple query images: Search for several different objects at once
Usage
Image-Conditioned Detection
Find all instances of an icon/object in a target image:
import requests
import base64
API_URL = "https://your-endpoint.endpoints.huggingface.cloud"
headers = {"Authorization": "Bearer YOUR_TOKEN"}
# Load images as base64
with open("screenshot.png", "rb") as f:
    target_b64 = base64.b64encode(f.read()).decode()
with open("icon.png", "rb") as f:
    query_b64 = base64.b64encode(f.read()).decode()
response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "target_image": target_b64,
        "query_image": query_b64,
        "threshold": 0.5
    }
})
print(response.json())
# {"detections": [{"box": [100, 200, 150, 250], "confidence": 0.92}]}
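The encoding and request-body steps can be wrapped in small helpers so they are not repeated for every image. This is a minimal sketch: `encode_image` and `build_image_query` are hypothetical names, not part of the handler, and the body layout simply mirrors the example request.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read an image file and return its raw bytes as a base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def build_image_query(target_path, query_path, threshold=0.5):
    """Assemble the JSON body for an image-conditioned request.

    Hypothetical helper: the keys follow the request shown in this README.
    """
    return {
        "inputs": {
            "target_image": encode_image(target_path),
            "query_image": encode_image(query_path),
            "threshold": threshold,
        }
    }
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.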
Text-Conditioned Detection
Find objects by description:
response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "target_image": target_b64,
        "queries": ["a play button", "a settings icon"],
        "threshold": 0.1
    }
})
Multiple Query Images
Find several different objects:
response = requests.post(API_URL, headers=headers, json={
    "inputs": {
        "target_image": target_b64,
        "query_images": [icon1_b64, icon2_b64, icon3_b64],
        "threshold": 0.5
    }
})
# Each detection includes a "label" such as "query_0", "query_1", etc.
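Labels like `query_0` are easier to work with once they are mapped back to meaningful names. The sketch below assumes the numeric suffix is the position of the image in `query_images`, which this README implies but does not state outright. `name_detections` is a hypothetical helper.

```python
def name_detections(detections, names):
    """Return copies of the detections with 'query_N' labels replaced by names[N].

    Assumption: N is the index of the image in the query_images list.
    Labels that do not match the pattern are left unchanged.
    """
    named = []
    for det in detections:
        prefix, _, index = det.get("label", "").partition("_")
        if prefix == "query" and index.isdigit() and int(index) < len(names):
            det = {**det, "label": names[int(index)]}
        named.append(det)
    return named
```

For example, `name_detections(response.json()["detections"], ["play", "pause", "stop"])` would relabel results for a three-image query.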
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| target_image | string | required | Base64-encoded target image |
| query_image | string | - | Base64-encoded reference image |
| query_images | array | - | Multiple base64-encoded reference images |
| queries | array | - | Text descriptions to search for |
| threshold | float | 0.5 | Confidence threshold (0-1) |
| nms_threshold | float | 0.3 | Non-max suppression threshold |
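The parameters above can be checked on the client before a request is sent. The sketch below is hypothetical (`build_inputs` is not part of the handler) and makes two assumptions the table does not confirm: that a request carries exactly one of `query_image`, `query_images`, or `queries`, and that `nms_threshold` is an overlap ratio between 0 and 1.

```python
def build_inputs(target_image, query_image=None, query_images=None,
                 queries=None, threshold=0.5, nms_threshold=0.3):
    """Build a request body using the documented parameter names and defaults."""
    supplied = {
        "query_image": query_image,
        "query_images": query_images,
        "queries": queries,
    }
    supplied = {key: value for key, value in supplied.items() if value is not None}
    # Assumption: the three query modes are mutually exclusive.
    if len(supplied) != 1:
        raise ValueError("provide exactly one of query_image, query_images, queries")
    # Assumption: both thresholds are expected to lie in [0, 1].
    for name, value in (("threshold", threshold), ("nms_threshold", nms_threshold)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return {
        "inputs": {
            "target_image": target_image,
            **supplied,
            "threshold": threshold,
            "nms_threshold": nms_threshold,
        }
    }
```

Failing early on the client gives a clearer error than a rejected request, but the handler remains the source of truth for what it accepts.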
Response Format
{
    "detections": [
        {
            "box": [x1, y1, x2, y2],
            "confidence": 0.95,
            "label": "query_0"
        }
    ]
}
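A response in this shape can be turned into typed objects for downstream use, for example to find the point at the middle of each box. This is a sketch under assumptions: `Detection` and `parse_detections` are hypothetical names, `label` is treated as optional because the image-conditioned example omits it, and the box values are taken to be corner coordinates in the target image.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float
    label: Optional[str] = None

    @property
    def center(self):
        """Midpoint of the box as (x, y)."""
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


def parse_detections(payload):
    """Convert the response JSON into Detection objects, highest confidence first."""
    detections = [
        Detection(*item["box"], confidence=item["confidence"], label=item.get("label"))
        for item in payload.get("detections", [])
    ]
    return sorted(detections, key=lambda d: d.confidence, reverse=True)
```

Calling `parse_detections(response.json())` returns an empty list when nothing was detected, so callers do not need a separate check for a missing key.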
Model
Uses google/owlv2-large-patch14-ensemble for best accuracy.