---
tags:
- image-to-text
- image-captioning
- endpoints-template
license: bsd-3-clause
library_name: generic
---
# Fork of [Salesforce/blip-image-captioning-large](https://huggingface.co/Salesforce/blip-image-captioning-large) for an `image-captioning` task on 🤗 Inference Endpoints
This repository implements a `custom` task for `image-captioning` for 🤗 Inference Endpoints. The code for the customized pipeline is in [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py).
To deploy this model as an Inference Endpoint, you have to select `Custom` as the task so that the custom pipeline code is used. Double check that `Custom` is selected before deploying.
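For context, the `generic` library template typically expects the repository to expose a `PreTrainedPipeline` class with `__init__` and `__call__` methods. The block below is only an illustrative sketch of what such a pipeline could look like for BLIP captioning, not the actual contents of [pipeline.py](https://huggingface.co/florentgbelidji/blip_captioning/blob/main/pipeline.py):
```python
# Hypothetical sketch of a custom captioning pipeline for the generic
# Inference Endpoints template; see pipeline.py for the real implementation.
import base64
import io
from typing import Any, Dict

import torch
from PIL import Image
from transformers import BlipForConditionalGeneration, BlipProcessor


class PreTrainedPipeline:
    def __init__(self, path: str = ""):
        # Load the BLIP processor and model from the repository path
        self.processor = BlipProcessor.from_pretrained(path)
        self.model = BlipForConditionalGeneration.from_pretrained(path)
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model.to(self.device)

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        # Decode the base64 image and read the optional text prompt
        image = Image.open(io.BytesIO(base64.b64decode(data["image"]))).convert("RGB")
        text = data.get("text")
        inputs = self.processor(image, text, return_tensors="pt").to(self.device)
        # Forward any generation parameters (num_beams, top_p, ...) to generate()
        generated = self.model.generate(**inputs, **data.get("parameters", {}))
        captions = self.processor.batch_decode(generated, skip_special_tokens=True)
        return {"captions": captions}
```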
### Expected request payload
```json
{
  "image": "/9j/4AAQSkZJRgA.....",  # base64-encoded image
  "text": "a photography of a"
}
```
Below is an example of how to run a request using Python and `requests`.
## Run Request
1. Download any online image:
```bash
wget https://storage.googleapis.com/sfr-vision-language-research/BLIP/demo.jpg
```
2. Run the request:
```python
import base64

import requests

ENDPOINT_URL = ""  # URL of your Inference Endpoint
HF_TOKEN = ""      # your Hugging Face access token

# Encode the downloaded image as a base64 string
with open("demo.jpg", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode()

def query(payload):
    headers = {
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    }
    response = requests.post(ENDPOINT_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "image": encoded_string,    # the base64-encoded image
    "text": "a photography of"  # optional text prompt
})
print(output)
```
Example `parameters` depending on the decoding strategy (a request sketch using them follows the list):
1. Beam search
```json
"parameters": {
  "num_beams": 5,
  "max_length": 20
}
```
2. Nucleus sampling
```json
"parameters": {
  "num_beams": 1,
  "max_length": 20,
  "do_sample": true,
  "top_k": 50,
  "top_p": 0.95
}
```
3. Contrastive search
```json
"parameters": {
  "penalty_alpha": 0.6,
  "top_k": 4,
  "max_length": 512
}
```
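Assuming the custom pipeline forwards the `parameters` object to `generate()`, a request combining the payload with one of the parameter sets above could look like the following sketch (it reuses `query` and `encoded_string` from the example above):
```python
# Minimal sketch of a request that includes decoding parameters, assuming
# the pipeline passes "parameters" through to generate().
output = query({
    "image": encoded_string,
    "text": "a photography of",
    "parameters": {   # beam search decoding
        "num_beams": 5,
        "max_length": 20,
    },
})
print(output)
```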
See the [generate()](https://huggingface.co/docs/transformers/v4.25.1/en/main_classes/text_generation#transformers.GenerationMixin.generate) documentation for additional details.
Expected output:
```python
{'captions': ['a photography of a woman and her dog on the beach']}
```