463465810cz committed
Commit: 5c40565 · 1 Parent(s): 8cb8316
Former-commit-id: 9c52469c9ec250295ed02e18b0db068ad04bb4e9

Files changed (1): README.md +2 -5
README.md CHANGED
@@ -6,7 +6,7 @@

---

- > **Abstract:** *Transformer-based methods have recently been widely used in low-level vision tasks, including image super-resolution (SR). These networks utilize self-attention along different dimensions, spatial or channel, and achieve impressive performance. This inspires us to combine the two dimensions in Transformer for a more powerful representation capability. Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. The alternate strategy enables DAT to capture the global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements two self-attention mechanisms from corresponding dimensions. Meanwhile, SGFN introduces additional non-linear spatial information in the feed-forward network. Extensive experiments show that our DAT surpasses current state-of-the-art methods.*
+ > **Abstract:** Transformer-based methods have recently been widely used in low-level vision tasks, including image super-resolution (SR). These networks utilize self-attention along different dimensions, spatial or channel, and achieve impressive performance. This inspires us to combine the two dimensions in Transformer for a more powerful representation capability. Based on the above idea, we propose a novel Transformer model, Dual Aggregation Transformer (DAT), for image SR. Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner. Specifically, we alternately apply spatial and channel self-attention in consecutive Transformer blocks. The alternate strategy enables DAT to capture the global context and realize inter-block feature aggregation. Furthermore, we propose the adaptive interaction module (AIM) and the spatial-gate feed-forward network (SGFN) to achieve intra-block feature aggregation. AIM complements two self-attention mechanisms from corresponding dimensions. Meanwhile, SGFN introduces additional non-linear spatial information in the feed-forward network. Extensive experiments show that our DAT surpasses current methods.*
>
> <p align="center">
> <img width="800" src="figs/DAT.png">

@@ -150,10 +150,7 @@ We achieved state-of-the-art performance. Detailed results can be found in the p
<img width="900" src="figs/Figure-4.png">
<img width="900" src="figs/Figure-5.png">
</p>
-
-
-
- - </details>
+ </details>

## Citation

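As an aside for readers skimming this diff: the alternation strategy the abstract describes can be sketched in a few lines of PyTorch. This is a minimal illustration under loose assumptions, not the code in this repository: the class names (`SpatialAttention`, `ChannelAttention`, `DualAggregationStage`) are invented, the spatial attention is global rather than windowed as in DAT, and AIM and SGFN are replaced by a plain MLP.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Self-attention across spatial positions (global here for brevity;
    DAT itself windows the spatial attention)."""
    def __init__(self, dim, num_heads):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):               # x: (B, N, C), N = H*W pixels
        out, _ = self.attn(x, x, x, need_weights=False)
        return out

class ChannelAttention(nn.Module):
    """Self-attention across channels: the attention map is channel-by-channel,
    so every channel map attends to every other one (global context)."""
    def __init__(self, dim, num_heads):
        super().__init__()
        self.num_heads = num_heads
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):               # x: (B, N, C)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 4, 1).unbind(0)    # (B, heads, C/heads, N)
        q = nn.functional.normalize(q, dim=-1)
        k = nn.functional.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)).softmax(dim=-1)  # (B, heads, C/h, C/h)
        out = (attn @ v).permute(0, 3, 1, 2).reshape(B, N, C)
        return self.proj(out)

class FeedForward(nn.Module):
    """Plain MLP standing in for SGFN (the spatial gate is omitted here)."""
    def __init__(self, dim, expand=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim * expand), nn.GELU(), nn.Linear(dim * expand, dim))

    def forward(self, x):
        return self.net(x)

class DualAggregationStage(nn.Module):
    """Inter-block aggregation: even blocks apply spatial self-attention,
    odd blocks apply channel self-attention, as the abstract describes."""
    def __init__(self, dim, depth=4, num_heads=4):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.ModuleList([
                nn.LayerNorm(dim),
                SpatialAttention(dim, num_heads) if i % 2 == 0
                else ChannelAttention(dim, num_heads),
                nn.LayerNorm(dim),
                FeedForward(dim),
            ]) for i in range(depth))

    def forward(self, x):
        for norm1, attn, norm2, ffn in self.blocks:
            x = x + attn(norm1(x))      # alternating attention, pre-norm residual
            x = x + ffn(norm2(x))
        return x

x = torch.randn(1, 64 * 64, 60)        # one 64x64 feature map with 60 channels
print(DualAggregationStage(dim=60)(x).shape)  # torch.Size([1, 4096, 60])
```

The even/odd split is one simple way to realize the "alternately apply" strategy; the pre-norm residual wiring follows standard Transformer practice, and the official implementation in this repository remains the authoritative reference.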
156