FG-CLIP: Fine-Grained Visual and Textual Alignment
Paper • 2505.05071 • Published • 18
A new generation of CLIP with strong fine-grained discrimination capability
Identify objects in images using labels
Visualize the similarity between an image and a label
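
The image-to-label matching described above is typically done, in CLIP-style models, by comparing an image embedding against one text embedding per label. The following is a minimal sketch of that scoring step only. It does not use FG-CLIP's actual API: the function names `cosine` and `label_probabilities`, the toy three-dimensional embeddings, and the temperature of 0.01 are all assumptions for illustration. In practice the embeddings would come from the model's image and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def label_probabilities(image_emb, label_embs, temperature=0.01):
    """Score one image embedding against several label embeddings.

    Returns the raw cosine similarities and a softmax over them.
    The temperature value is an assumed placeholder, not a model constant.
    """
    sims = [cosine(image_emb, emb) for emb in label_embs]
    logits = [s / temperature for s in sims]
    # Subtract the maximum logit before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sims, probs

# Hypothetical stand-in embeddings; a real model would produce these.
labels = ["cat", "dog", "car"]
label_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_emb = [0.9, 0.1, 0.0]

sims, probs = label_probabilities(image_emb, label_embs)
best_label = labels[probs.index(max(probs))]
print(best_label)
```

The raw similarities in `sims` are what a similarity visualization would plot, while `probs` gives a ranking over the candidate labels for identification.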