Update README.md
README.md
@@ -171,6 +171,49 @@ OpenCLIP software
 }
 ```

+CLIP benchmark software
+```
+@software{cherti_2025_15403103,
+  author       = {Cherti, Mehdi and
+                  Beaumont, Romain},
+  title        = {CLIP benchmark},
+  month        = may,
+  year         = 2025,
+  publisher    = {Zenodo},
+  doi          = {10.5281/zenodo.15403103},
+  url          = {https://doi.org/10.5281/zenodo.15403103},
+  swhid        = {swh:1:dir:8cf49a5dd06f59224844a1e767337a1d14ee56c2
+                  ;origin=https://doi.org/10.5281/zenodo.15403102;vi
+                  sit=swh:1:snp:dd153b26f702d614346bf814f723d59fef3d
+                  77a2;anchor=swh:1:rel:cff2aeb98f42583b44fdab5374e9
+                  fa71793f2cff;path=CLIP\\_benchmark-main
+                 },
+}
+```
+
 # How to Get Started with the Model

-
+Zero-shot classification example:
+
+```python
+import torch
+from PIL import Image
+import open_clip
+
+model, _, preprocess = open_clip.create_model_and_transforms('hf-hub:laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K')
+model.eval()  # model in train mode by default, impacts some models with BatchNorm or stochastic depth active
+tokenizer = open_clip.get_tokenizer('hf-hub:laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K')
+
+image = preprocess(Image.open("docs/CLIP.png")).unsqueeze(0)
+text = tokenizer(["a diagram", "a dog", "a cat"])
+
+with torch.no_grad(), torch.autocast("cuda"):
+    image_features = model.encode_image(image)
+    text_features = model.encode_text(text)
+    image_features /= image_features.norm(dim=-1, keepdim=True)
+    text_features /= text_features.norm(dim=-1, keepdim=True)
+
+    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
+
+print("Label probs:", text_probs)  # prints: [[1., 0., 0.]]
+```
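The example above assumes the `open_clip` package is installed (published on PyPI as `open_clip_torch`). The same two encoders also support simple text-to-image retrieval; below is a minimal sketch, not part of the card itself, using placeholder image paths:

```python
import torch
from PIL import Image
import open_clip

# Same model handle as in the zero-shot example above.
model, _, preprocess = open_clip.create_model_and_transforms('hf-hub:laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K')
model.eval()
tokenizer = open_clip.get_tokenizer('hf-hub:laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K')

# Placeholder gallery; replace with real image paths.
paths = ["cat.png", "dog.png", "diagram.png"]
images = torch.stack([preprocess(Image.open(p)) for p in paths])
query = tokenizer(["a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(images)  # one embedding per gallery image
    text_features = model.encode_text(query)
    # L2-normalize so the dot product below is cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

sims = (text_features @ image_features.T).squeeze(0)  # one score per gallery image
best = int(sims.argmax())
print("Best match:", paths[best], "score:", float(sims[best]))
```

Because both sets of embeddings are normalized, ranking by dot product is the same cosine-similarity scoring that underlies the zero-shot softmax in the card's example.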