---
license: apache-2.0
---
This is an LFS clone of [github.com/zhengchen1999/DAT](https://github.com/zhengchen1999/DAT/tree/6846c8798e4a0579451982bd8b6e441f4b1e78b0) with the pretrained models uploaded.<br>
All credit goes to the original author.<br>
The purpose of this clone is to allow easier automated download of the pretrained models for [github.com/AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui).<br>
Note that the commit hashes differ from the origin repo because the history was rewritten when converting to LFS.
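If you want to pull the weights from this mirror directly, a minimal sketch with Git LFS (the clone URL below is a placeholder; use the clone URL shown on this model page):
```bash
# A minimal sketch: clone this mirror with Git LFS installed so the pretrained .pth files
# are downloaded as real files rather than LFS pointer stubs.
# The URL below is a placeholder; use this model page's clone URL.
git lfs install
git clone https://huggingface.co/<namespace>/DAT
```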
---
# Dual Aggregation Transformer for Image Super-Resolution
[Zheng Chen](https://zhengchen1999.github.io/), [Yulun Zhang](http://yulunzhang.com/), [Jinjin Gu](https://www.jasongt.com/), [Linghe Kong](https://www.cs.sjtu.edu.cn/~linghe.kong/), [Xiaokang Yang](https://scholar.google.com/citations?user=yDEavdMAAAAJ&hl), and [Fisher Yu](https://www.yf.io/), "Dual Aggregation Transformer for Image Super-Resolution", ICCV, 2023
[[paper](https://openaccess.thecvf.com/content/ICCV2023/papers/Chen_Dual_Aggregation_Transformer_for_Image_Super-Resolution_ICCV_2023_paper.pdf)] [[arXiv](http://arxiv.org/abs/2308.03364)] [[supplementary material](https://github.com/zhengchen1999/DAT/releases)] [[visual results](https://drive.google.com/drive/folders/1ZMaZyCer44ZX6tdcDmjIrc_hSsKoMKg2?usp=drive_link)] [[pretrained models](https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM?usp=drive_link)]
#### 🔥🔥🔥 News
- **2023-09-17:** [chaiNNer](https://github.com/chaiNNer-org/chaiNNer) and [neosr](https://github.com/muslll/neosr) now support DAT. Additional community-trained DAT models are available on [OpenModelDB](https://openmodeldb.info/?sort=date-desc&t=arch%3Adat) ([#11](https://github.com/zhengchen1999/DAT/issues/11)). Thanks to [Phhofm](https://github.com/Phhofm)!
- **2023-07-16:** This repo is released.
- **2023-07-14:** DAT is accepted at ICCV 2023. 🎉🎉🎉
---
> **Abstract:** *Transformer has recently gained considerable popularity in low-level vision tasks, including image super-resolution (SR). These networks utilize self-attention along different dimensions, spatial or channel, and achieve impressive performance. This inspires us to combine the two dimensions in Transformer for a more powerful representation capability. Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. The alternate strategy enables DAT to capture the global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements two self-attention mechanisms from corresponding dimensions. Meanwhile, SGFN introduces additional non-linear spatial information in the feed-forward network. Extensive experiments show that our DAT surpasses current methods.*

---
| HR | LR | [SwinIR](https://github.com/JingyunLiang/SwinIR) | [CAT](https://github.com/zhengchen1999/CAT) | DAT (ours) |
| :------------------------------------------: | :-----------------------------------------------: | :----------------------------------------------: | :-------------------------------------------: | :-------------------------------------------: |
| <img src="figs/img_059_HR_x4.png" height=80> | <img src="figs/img_059_Bicubic_x4.png" height=80> | <img src="figs/img_059_SwinIR_x4.png" height=80> | <img src="figs/img_059_CAT_x4.png" height=80> | <img src="figs/img_059_DAT_x4.png" height=80> |
| <img src="figs/img_049_HR_x4.png" height=80> | <img src="figs/img_049_Bicubic_x4.png" height=80> | <img src="figs/img_049_SwinIR_x4.png" height=80> | <img src="figs/img_049_CAT_x4.png" height=80> | <img src="figs/img_049_DAT_x4.png" height=80> |
## Dependencies
- Python 3.8
- PyTorch 1.8.0
- NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads)
```bash
# Clone the GitHub repo and enter the default directory 'DAT'.
git clone https://github.com/zhengchen1999/DAT.git
cd DAT
conda create -n DAT python=3.8
conda activate DAT
pip install -r requirements.txt
python setup.py develop
```
## Contents
1. [Datasets](#Datasets)
1. [Models](#Models)
1. [Training](#Training)
1. [Testing](#Testing)
1. [Results](#Results)
1. [Citation](#Citation)
1. [Acknowledgements](#Acknowledgements)
---
## Datasets
The training and testing sets used in this work can be downloaded as follows:
| Training Set | Testing Set | Visual Results |
| :----------------------------------------------------------- | :----------------------------------------------------------: | :----------------------------------------------------------: |
| [DIV2K](https://data.vision.ee.ethz.ch/cvl/DIV2K/) (800 training images, 100 validation images) + [Flickr2K](https://cv.snu.ac.kr/research/EDSR/Flickr2K.tar) (2650 images) [complete training dataset DF2K: [Google Drive](https://drive.google.com/file/d/1TubDkirxl4qAWelfOnpwaSKoj3KLAIG4/view?usp=share_link) / [Baidu Disk](https://pan.baidu.com/s/1KIcPNz3qDsGSM0uDKl4DRw?pwd=74yc)] | Set5 + Set14 + BSD100 + Urban100 + Manga109 [complete testing dataset: [Google Drive](https://drive.google.com/file/d/1yMbItvFKVaCT93yPWmlP3883XtJ-wSee/view?usp=sharing) / [Baidu Disk](https://pan.baidu.com/s/1Tf8WT14vhlA49TO2lz3Y1Q?pwd=8xen)] | [Google Drive](https://drive.google.com/drive/folders/1ZMaZyCer44ZX6tdcDmjIrc_hSsKoMKg2?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1LO-INqy40F5T_coAJsl5qw?pwd=dqnv#list/path=%2F) |
Download the training and testing datasets and put them into the corresponding folders under `datasets/`, as sketched below. See [datasets](datasets/README.md) for details of the directory structure.
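For reference, a minimal sketch of unpacking the downloads (archive names and paths are placeholders for whatever you downloaded from the links above; adjust the commands if the archives are `.tar` rather than `.zip`):
```bash
# A minimal sketch, assuming zip archives (file names below are placeholders).
mkdir -p datasets
unzip /path/to/DF2K.zip -d datasets/
unzip /path/to/test_datasets.zip -d datasets/
# The expected sub-folder layout is described in datasets/README.md.
```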
## Models
| Method | Params | FLOPs (G) | Dataset | PSNR (dB) | SSIM | Model Zoo | Visual Results |
| :-------- | :----: | :-------: | :------: | :-------: | :----: | :----------------------------------------------------------: | :----------------------------------------------------------: |
| DAT-S | 11.21M | 203.34 | Urban100 | 27.68 | 0.8300 | [Google Drive](https://drive.google.com/drive/folders/1hM0v3fUg5u6GjkI7dduxShyGgGfEwQXO?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1rgkCyqEJdZlHvQ6_Dwb3rA?pwd=4rfr) | [Google Drive](https://drive.google.com/file/d/1x1ixMswxw5w-zeZ_Rap5Nk4Tr46MIjAw/view?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1LO-INqy40F5T_coAJsl5qw?pwd=dqnv) |
| DAT | 14.80M | 275.75 | Urban100 | 27.87 | 0.8343 | [Google Drive](https://drive.google.com/drive/folders/14VG5mw5ie8RrR4jjypeHynXDZYWL8w-r?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1rgkCyqEJdZlHvQ6_Dwb3rA?pwd=4rfr) | [Google Drive](https://drive.google.com/file/d/1K43CTsXpoX5St5fed4kEW9gu2KMR6hLu/view?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1LO-INqy40F5T_coAJsl5qw?pwd=dqnv) |
| DAT-2 | 11.21M | 216.93 | Urban100 | 27.86 | 0.8341 | [Google Drive](https://drive.google.com/drive/folders/1yV9LMhr2tYM_eHEIVY4Jw9X3bWGgorbD?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1rgkCyqEJdZlHvQ6_Dwb3rA?pwd=4rfr) | [Google Drive](https://drive.google.com/file/d/1TQRZIg8at5HX87OCu3GYytZhYGperkuN/view?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1LO-INqy40F5T_coAJsl5qw?pwd=dqnv) |
| DAT-light | 573K | 49.69 | Urban100 | 26.64 | 0.8033 | [Google Drive](https://drive.google.com/drive/folders/105JRMN5VJbJ7EMQJdqmhDVMAFCaKYDl8?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1rgkCyqEJdZlHvQ6_Dwb3rA?pwd=4rfr) | [Google Drive](https://drive.google.com/file/d/1xKxK6_UcqAWK2m5znQX_LssWndmN-End/view?usp=drive_link) / [Baidu Disk](https://pan.baidu.com/s/1LO-INqy40F5T_coAJsl5qw?pwd=dqnv) |
Performance is reported on Urban100 (×4). FLOPs are measured at an output size of 3×512×512 for DAT-S, DAT, and DAT-2, and 3×1280×720 for DAT-light.
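To sanity-check a downloaded checkpoint against the parameter counts above, a minimal sketch (the `.pth` file name is a placeholder for whichever model you downloaded; BasicSR-style checkpoints are assumed):
```bash
# A minimal sketch (not part of the original repo): load a downloaded checkpoint and
# report its parameter count. Replace the placeholder path with your downloaded model.
python -c "
import torch
ckpt = torch.load('experiments/pretrained_models/DAT_x4.pth', map_location='cpu')
state = ckpt.get('params', ckpt)  # BasicSR-style checkpoints usually keep weights under 'params'
print(f'{sum(v.numel() for v in state.values()) / 1e6:.2f}M parameters')
"
```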
## Training
- Download [training](https://drive.google.com/file/d/1TubDkirxl4qAWelfOnpwaSKoj3KLAIG4/view?usp=share_link) (DF2K, already processed) and [testing](https://drive.google.com/file/d/1yMbItvFKVaCT93yPWmlP3883XtJ-wSee/view?usp=sharing) (Set5, Set14, BSD100, Urban100, Manga109, already processed) datasets, place them in `datasets/`.
- Run the following scripts. The training configurations are in `options/Train/`.
```shell
# DAT-S, input=64x64, 4 GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_S_x2.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_S_x3.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_S_x4.yml --launcher pytorch
# DAT, input=64x64, 4 GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_x2.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_x3.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_x4.yml --launcher pytorch
# DAT-2, input=64x64, 4 GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_2_x2.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_2_x3.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_2_x4.yml --launcher pytorch
# DAT-light, input=64x64, 4 GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_light_x2.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_light_x3.yml --launcher pytorch
python -m torch.distributed.launch --nproc_per_node=4 --master_port=4321 basicsr/train.py -opt options/Train/train_DAT_light_x4.yml --launcher pytorch
```
- Training outputs (logs and checkpoints) are saved in `experiments/`.
## Testing
### Test images with HR
- Download the pre-trained [models](https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM?usp=drive_link) and place them in `experiments/pretrained_models/`.
We provide pre-trained models for image SR: DAT-S, DAT, DAT-2, and DAT-light (x2, x3, x4).
- Download [testing](https://drive.google.com/file/d/1yMbItvFKVaCT93yPWmlP3883XtJ-wSee/view?usp=sharing) (Set5, Set14, BSD100, Urban100, Manga109) datasets, place them in `datasets/`.
- Run the following scripts. The testing configurations are in `options/Test/` (e.g., [test_DAT_x2.yml](options/Test/test_DAT_x2.yml)).
Note 1: You can set `use_chop: True` (default: `False`) in the YML to chop the image into patches for testing.
```shell
# No self-ensemble
# DAT-S, reproduces results in Table 2 of the main paper
python basicsr/test.py -opt options/Test/test_DAT_S_x2.yml
python basicsr/test.py -opt options/Test/test_DAT_S_x3.yml
python basicsr/test.py -opt options/Test/test_DAT_S_x4.yml
# DAT, reproduces results in Table 2 of the main paper
python basicsr/test.py -opt options/Test/test_DAT_x2.yml
python basicsr/test.py -opt options/Test/test_DAT_x3.yml
python basicsr/test.py -opt options/Test/test_DAT_x4.yml
# DAT-2, reproduces results in Table 1 of the supplementary material
python basicsr/test.py -opt options/Test/test_DAT_2_x2.yml
python basicsr/test.py -opt options/Test/test_DAT_2_x3.yml
python basicsr/test.py -opt options/Test/test_DAT_2_x4.yml
# DAT-light, reproduces results in Table 2 of the supplementary material
python basicsr/test.py -opt options/Test/test_DAT_light_x2.yml
python basicsr/test.py -opt options/Test/test_DAT_light_x3.yml
python basicsr/test.py -opt options/Test/test_DAT_light_x4.yml
```
- The output is in `results/`.
### Test images without HR
- Download the pre-trained [models](https://drive.google.com/drive/folders/1iBdf_-LVZuz_PAbFtuxSKd_11RL1YKxM?usp=drive_link) and place them in `experiments/pretrained_models/`.
We provide pre-trained models for image SR: DAT-S, DAT, and DAT-2 (x2, x3, x4).
- Put your dataset (single LR images) in `datasets/single`. Some test images are already in this folder; a sketch for adding your own is at the end of this section.
- Run the following scripts. The testing configurations are in `options/Test/` (e.g., [test_single_x2.yml](options/Test/test_single_x2.yml)).
Note 1: The default model is DAT. You can test other models, such as DAT-S, by modifying the YML.
Note 2: You can set `use_chop: True` (default: `False`) in the YML to chop the image into patches for testing.
```shell
# Test on your dataset
python basicsr/test.py -opt options/Test/test_single_x2.yml
python basicsr/test.py -opt options/Test/test_single_x3.yml
python basicsr/test.py -opt options/Test/test_single_x4.yml
```
- The output is in `results/`.
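For reference, a minimal sketch of placing your own images in `datasets/single` (the source path is a placeholder):
```bash
# A minimal sketch (source path is a placeholder): copy your own low-resolution images
# into datasets/single before running the test_single_x*.yml configs above.
mkdir -p datasets/single
cp /path/to/your/lr_images/*.png datasets/single/
```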
## Results
We achieved state-of-the-art performance. Detailed results can be found in the paper. All visual results of DAT can be downloaded [here](https://drive.google.com/drive/folders/1ZMaZyCer44ZX6tdcDmjIrc_hSsKoMKg2?usp=drive_link).
<details>
<summary>Click to expand</summary>
- results in Table 2 of the main paper
<p align="center">
<img width="900" src="figs/Table-1.png">
</p>
- results in Table 1 of the supplementary material
<p align="center">
<img width="900" src="figs/Table-2.png">
</p>
- results in Table 2 of the supplementary material
<p align="center">
<img width="900" src="figs/Table-3.png">
</p>
- visual comparison (x4) in the main paper
<p align="center">
<img width="900" src="figs/Figure-1.png">
</p>
- visual comparison (x4) in the supplementary material
<p align="center">
<img width="900" src="figs/Figure-2.png">
<img width="900" src="figs/Figure-3.png">
<img width="900" src="figs/Figure-4.png">
<img width="900" src="figs/Figure-5.png">
</p>
</details>
## Citation
If you find the code helpful in your research or work, please cite the following paper(s).
```
@inproceedings{chen2023dual,
title={Dual Aggregation Transformer for Image Super-Resolution},
author={Chen, Zheng and Zhang, Yulun and Gu, Jinjin and Kong, Linghe and Yang, Xiaokang and Yu, Fisher},
booktitle={ICCV},
year={2023}
}
```
## Acknowledgements
This code is built on [BasicSR](https://github.com/XPixelGroup/BasicSR).