---
license: cc-by-nc-sa-4.0
library_name: transformers
pipeline_tag: image-classification
---

# SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning (ICLR 2024)

This repository contains the model described in https://arxiv.org/abs/2403.13684.

Code: https://github.com/Visual-AI/SPTNet

<p align="center">
    <a href="https://arxiv.org/abs/2403.13684"><img src="https://img.shields.io/badge/arXiv-2403.13684-b31b1b"></a> <a href="https://visual-ai.github.io/sptnet/"><img src="https://img.shields.io/badge/Project-Website-blue"></a><a href="#jump"><img src="https://img.shields.io/badge/Citation-8A2BE2"></a>
</p>
<p align="center">
	SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning <br>
  By
  <a href="https://whj363636.github.io/">Hongjun Wang</a>, 
  <a href="https://sgvaze.github.io/">Sagar Vaze</a>, and 
  <a href="https://www.kaihan.org/">Kai Han</a>.
</p>


[05.2024] We have updated the results of SPTNet with DINOv2 on CUB; see the latest version on [arXiv](https://arxiv.org/abs/2403.13684).

| Dataset (backbone) | All  | Old  | New  |
|--------------------|------|------|------|
| CUB (DINO)         | 65.8 | 68.8 | 65.1 |
| CUB (DINOv2)       | 76.3 | 79.5 | 74.6 |



## Results
Generic results:

| Dataset      | All  | Old  | New  |
|--------------|------|------|------|
| CIFAR-10     | 97.3 | 95.0 | 98.6 |
| CIFAR-100    | 81.3 | 84.3 | 75.6 |
| ImageNet-100 | 85.4 | 93.2 | 81.4 |

Fine-grained results:

| Dataset       | All  | Old  | New  |
|---------------|------|------|------|
| CUB           | 65.8 | 68.8 | 65.1 |
| Stanford Cars | 59.0 | 79.2 | 49.3 |
| FGVC-Aircraft | 59.3 | 61.8 | 58.1 |
| Herbarium19   | 43.4 | 58.7 | 35.2 |
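
For readers unfamiliar with the All / Old / New columns, the sketch below shows how such numbers are typically computed, assuming the tables follow the standard Generalized Category Discovery protocol: clustering accuracy (in percent) after a one-to-one matching between predicted clusters and ground-truth classes, with the matching found once over all unlabelled samples and then scored on the Old and New subsets. The function name `gcd_accuracy` and the toy labels are hypothetical and are not taken from the SPTNet code. The brute-force search over permutations is a stand-in that only works for a handful of classes; real evaluations usually solve the same assignment with the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`).

```python
from itertools import permutations


def gcd_accuracy(y_true, y_pred, old_classes):
    """Return (all, old, new) clustering accuracy in percent.

    y_true:      ground-truth class ids of the unlabelled samples
    y_pred:      predicted cluster ids for the same samples
    old_classes: set of class ids that also appear in the labelled set
    """
    # Cluster ids and class ids are treated as one shared label space.
    labels = sorted(set(y_true) | set(y_pred))

    # Brute-force stand-in for the Hungarian algorithm: try every
    # one-to-one cluster -> class mapping and keep the best one.
    best_correct, best_map = -1, None
    for perm in permutations(labels):
        mapping = dict(zip(labels, perm))
        correct = sum(mapping[p] == t for t, p in zip(y_true, y_pred))
        if correct > best_correct:
            best_correct, best_map = correct, mapping

    def acc(keep):
        # Score a subset of samples using the single global mapping.
        pairs = [(t, p) for t, p in zip(y_true, y_pred) if keep(t)]
        hits = sum(best_map[p] == t for t, p in pairs)
        return 100.0 * hits / len(pairs)

    return (
        acc(lambda t: True),                  # All
        acc(lambda t: t in old_classes),      # Old
        acc(lambda t: t not in old_classes),  # New
    )


# Toy example: classes 0 and 1 are "Old", class 2 is "New".
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [1, 1, 0, 0, 2, 2, 2, 0]
print(gcd_accuracy(y_true, y_pred, old_classes={0, 1}))
# → (87.5, 100.0, 75.0)
```

In the toy example, cluster ids 0 and 1 are swapped relative to the class ids, which the matching step corrects; the single New-class sample assigned to the wrong cluster is what lowers the New and All scores.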



## Citing this work
<span id="jump"></span>
If you find this repo useful for your research, please consider citing our paper:

```bibtex
@inproceedings{wang2024sptnet,
    author    = {Wang, Hongjun and Vaze, Sagar and Han, Kai},
    title     = {SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning},
    booktitle = {International Conference on Learning Representations (ICLR)},
    year      = {2024}
}
```