Description of available models

The models are variational autoencoders (VAEs) and compressive autoencoders (CAEs), each equipped with an additional variance decoder, which can be used to restore images with the Variational Bayes Latent Estimation (VBLE) algorithm.
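As an illustration, VBLE restores an image by optimizing a Gaussian variational posterior over the latent variable of one of these models. Below is a minimal conceptual sketch for the VAE case with a standard Gaussian prior; `decoder`, the degradation operator `A`, and all hyperparameters are placeholders, so refer to the official VBLE code for the actual algorithm.

```python
# Conceptual sketch of VBLE-style latent estimation (illustration only).
# `decoder` and `A` are assumed callables: a trained decoder and a known
# degradation operator for the inverse problem y = A(x) + noise.
import torch

def vble_sketch(y, decoder, A, latent_shape, noise_std, n_iters=500, lr=1e-2):
    # Variational posterior q(z) = N(mu, diag(sigma^2)), optimized by gradient descent
    mu = torch.zeros(latent_shape, requires_grad=True)
    log_sigma = torch.zeros(latent_shape, requires_grad=True)
    opt = torch.optim.Adam([mu, log_sigma], lr=lr)

    for _ in range(n_iters):
        opt.zero_grad()
        # Reparameterization trick: sample z ~ q(z)
        z = mu + torch.exp(log_sigma) * torch.randn_like(mu)
        x_hat = decoder(z)
        # Data fidelity under a Gaussian noise model
        data_term = ((y - A(x_hat)) ** 2).sum() / (2 * noise_std ** 2)
        # KL(q(z) || N(0, I)) for a standard Gaussian latent prior (VAE case)
        kl = 0.5 * (mu ** 2 + torch.exp(2 * log_sigma) - 2 * log_sigma - 1).sum()
        (data_term + kl).backward()
        opt.step()

    with torch.no_grad():
        return decoder(mu)  # restored image estimate
```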

Model Details

The models comprise simple VAEs trained on CelebA, and the CAEs mbt [1] and cheng [2] trained on several datasets: FFHQ [3], BSDS500 [4], and a realistic satellite dataset simulated from PCRS [5].

Quick description of all models

1lvae-fcb_gamma-variable_M-64_celeba-wb_std-diagonal:

  • Architecture: VAE with fully connected bottleneck, latent_dimension = 64.
  • Dataset: CelebA (black and white)

1lvae-fcb_gamma-variable_M-256_celeba-wb_std-diagonal:

  • Architecture: VAE with fully connected bottleneck, latent_dimension = 256.
  • Dataset: CelebA (black and white)

1lvae-light_gamma-variable_M-64_celeba-wb_std-diagonal:

  • Architecture: VAE with fully convolutional bottleneck, latent_dimension = 64.
  • Dataset: CelebA (black and white)

cheng_0.0483_bsd_std-diagonal

  • Architecture: cheng [2] model with latent dimension M = 192
  • Dataset: BSDS500 (RGB)
  • Bitrate parameter alpha = 0.0483 (medium bitrate model)

cheng_0.0483_ffhq_std-diagonal

  • Architecture: cheng [2] model with latent dimension M = 192
  • Dataset: FFHQ256 (RGB)
  • Bitrate parameter alpha = 0.0483 (medium bitrate model)

cheng_0.1800_bsd_std-diagonal

  • Architecture: cheng [2] model with latent dimension M = 192
  • Dataset: BSDS500 (RGB)
  • Bitrate parameter alpha = 0.1800 (high bitrate model)

cheng_0.1800_ffhq_std-diagonal

  • Architecture: cheng [2] model with latent dimension M = 192
  • Dataset: FFHQ256 (RGB)
  • Bitrate parameter alpha = 0.1800 (high bitrate model)

mbt_0.0483_bsd_std-diagonal

  • Architecture: mbt [1] model with latent dimension M = 320
  • Dataset: BSDS500 (RGB)
  • Bitrate parameter alpha = 0.0483 (medium bitrate model)

mbt_0.0483_ffhq_std-diagonal

  • Architecture: mbt [1] model with latent dimension M = 320
  • Dataset: FFHQ256 (RGB)
  • Bitrate parameter alpha = 0.0483 (medium bitrate model)

mbt_0.1800_bsd_std-diagonal

  • Architecture: mbt [1] model with latent dimension M = 320
  • Dataset: BSDS500 (RGB)
  • Bitrate parameter alpha = 0.1800 (high bitrate model)

mbt_0.1800_ffhq_std-diagonal

  • Architecture: mbt [1] model with latent dimension M = 320
  • Dataset: FFHQ256 (RGB)
  • Bitrate parameter alpha = 0.1800 (high bitrate model)

mbt_25cm_PCRS_0.3600_std-diagonal:

  • Architecture: mbt [1] model with latent dimension M = 320.
  • Dataset: PCRS (satellite images downsampled to 25 cm resolution, black and white)
  • Bitrate parameter alpha = 0.3600 (very high bitrate model)

mbt_50cm_PCRS_0.3600_std-diagonal:

  • Architecture: mbt [1] model with latent dimension M = 320.
  • Dataset: PCRS (satellite images downsampled to 50 cm resolution, black and white)
  • Bitrate parameter alpha = 0.3600 (very high bitrate model)
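The checkpoint names above follow a fixed convention. As a convenience, here is a hypothetical helper that decodes the CAE names into their components (the list above remains the reference):

```python
# Decode a CAE checkpoint name into architecture, bitrate parameter, and dataset.
import re

CAE_PATTERN = re.compile(
    r"(?P<arch>mbt|cheng)_(?:(?P<res>\d+cm)_PCRS_)?(?P<alpha>[0-9.]+)"
    r"(?:_(?P<dataset>bsd|ffhq))?_std-diagonal"
)

def parse_cae_name(name: str) -> dict:
    m = CAE_PATTERN.fullmatch(name)
    if m is None:
        raise ValueError(f"not a CAE checkpoint name: {name}")
    d = m.groupdict()
    d["latent_dim"] = 320 if d["arch"] == "mbt" else 192  # M, from the list above
    d["dataset"] = d["dataset"] or f"PCRS_{d['res']}"
    return d

print(parse_cae_name("cheng_0.0483_ffhq_std-diagonal"))
# {'arch': 'cheng', 'res': None, 'alpha': '0.0483', 'dataset': 'ffhq', 'latent_dim': 192}
print(parse_cae_name("mbt_25cm_PCRS_0.3600_std-diagonal"))
# {'arch': 'mbt', 'res': '25cm', 'alpha': '0.3600', 'dataset': 'PCRS_25cm', 'latent_dim': 320}
```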

Training Details

Training Procedure

Pretraining: none for VAEs; CAEs are initialized from pretrained CompressAI models.

Two-stage training:

  • Encoder and decoder finetuning.
  • Variance decoder training (from scratch), with the MLE loss and a diagonal Gaussian decoder model (sketched below).

VAE specifics: isotropic Gaussian decoder model whose standard deviation gamma is optimized as a network parameter in the first stage.

CAE specifics: fixed bitrate parameter alpha (the CAE counterpart of gamma) in the first stage.
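The second stage amounts to fitting per-pixel variances under a diagonal Gaussian decoder model. Below is a minimal sketch of one such maximum-likelihood update, with hypothetical module names, a deterministic encoding for brevity, and the stage-1 weights held fixed (only the variance decoder parameters are optimized, matching parameters=dec_variance below):

```python
# One MLE training step for the variance decoder (stage 2), sketched.
import torch

def diagonal_gaussian_nll(x, mu, log_var):
    # -log N(x; mu, diag(exp(log_var))), up to an additive constant
    return 0.5 * (((x - mu) ** 2) * torch.exp(-log_var) + log_var).sum()

def second_stage_step(x, encoder, decoder, variance_decoder, optimizer):
    with torch.no_grad():              # stage-1 weights are not updated here
        z = encoder(x)                 # deterministic encoding for brevity
        mu = decoder(z)                # mean of the Gaussian decoder model
    log_var = variance_decoder(z)      # per-pixel log-variance (diagonal model)
    loss = diagonal_gaussian_nll(x, mu, log_var)
    optimizer.zero_grad()
    loss.backward()
    # clip_max_norm=1 in the second stage, per the hyperparameters below
    torch.nn.utils.clip_grad_norm_(variance_decoder.parameters(), max_norm=1)
    optimizer.step()
    return loss.item()
```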

Training Hyperparameters

  • Training regime: fp32

First stage

  • lr=1e-4
  • optimizer=Adam
  • batch_size=256 for VAEs, batch_size=16 for CAEs
  • patch_size=64 for CelebA, patch_size=256 otherwise
  • clip_max_norm=20 (gradient clipping)
  • parameters=autoencoder
  • loss_type=elbo
  • sample_rate_scale=false (whether to modulate the bitrate for CAEs during training)

Second stage

  • lr=1e-4
  • optimizer=Adam
  • batch_size=256 for VAEs, batch_size=16 for CAEs
  • patch_size=64 for CelebA, patch_size=256 otherwise
  • clip_max_norm=1 (gradient clipping)
  • parameters=dec_variance
  • loss_type=elbo
  • sample_rate_scale=true for CAEs, false for VAEs (whether to modulate the bitrate for CAEs during training)
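For reference, the same hyperparameters collected as Python dictionaries (key names mirror the list above; the actual configuration schema of the training code may differ):

```python
# Hyperparameters of both training stages, as listed above (VAE values shown).
first_stage = {
    "lr": 1e-4,
    "optimizer": "Adam",
    "batch_size": 256,           # 16 for CAEs
    "patch_size": 64,            # 256 for datasets other than CelebA
    "clip_max_norm": 20,         # gradient clipping
    "parameters": "autoencoder",
    "loss_type": "elbo",
    "sample_rate_scale": False,
}
second_stage = {
    **first_stage,
    "clip_max_norm": 1,
    "parameters": "dec_variance",
    "sample_rate_scale": True,   # CAEs only; stays False for VAEs
}
```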

Model Architecture and Objective

VAEs

1lvae-fcb: 4 convolutional layers in each module (encoder, decoder, variance decoder), fully connected bottleneck, latent_dimension = M as specified in the model name.

1lvae-light: 4 convolutional layers in each module (encoder, decoder, variance decoder), fully convolutional bottleneck, latent_dimension = M as specified in the model name.
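For concreteness, a hedged sketch of the 1lvae-light layout is given below: 4 convolutional layers per module and a fully convolutional bottleneck. Channel widths, kernel sizes, strides, and activations are assumptions; only the overall structure follows the description above.

```python
# Sketch of a 1lvae-light-style VAE (structural illustration, not the exact model).
import torch.nn as nn

def conv_stack(c_in, c_out, transpose=False):
    # 4 convolutional layers per module, strided down (encoder) or up (decoders)
    Conv = nn.ConvTranspose2d if transpose else nn.Conv2d
    hidden = [256, 128, 64] if transpose else [64, 128, 256]
    chans = [c_in] + hidden + [c_out]
    extra = {"output_padding": 1} if transpose else {}
    layers = []
    for i in range(4):
        layers += [Conv(chans[i], chans[i + 1], kernel_size=5, stride=2,
                        padding=2, **extra),
                   nn.ReLU(inplace=True)]
    return nn.Sequential(*layers[:-1])  # no activation after the last layer

class LightVAE(nn.Module):
    def __init__(self, M=64):
        super().__init__()
        # Grayscale input (the CelebA models are black and white per this card)
        self.encoder = conv_stack(1, 2 * M)  # (mu_z, log_var_z) feature maps
        self.decoder = conv_stack(M, 1, transpose=True)
        self.variance_decoder = conv_stack(M, 1, transpose=True)  # per-pixel log-variance
```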

CAEs

CAE with a hyperprior (i.e., two latent variables) and an autoregressive module. See [1] and [2] for the mbt and cheng architectures.
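Since the CAEs are initialized from pretrained CompressAI models (see Training Details), the corresponding architectures can be inspected directly through the CompressAI zoo; the forward pass exposes the hyperprior structure as two sets of latent likelihoods. The quality indices below are illustrative and need not match the exact checkpoints used here.

```python
# Inspecting an mbt [1] model from CompressAI: the output carries a
# reconstruction and likelihoods for both the main latent y and the hyper-latent z.
import torch
from compressai.zoo import mbt2018, cheng2020_attn  # cheng2020_attn matches [2]

net = mbt2018(quality=5, pretrained=True).eval()  # qualities 5-8 use M = 320
x = torch.rand(1, 3, 256, 256)
with torch.no_grad():
    out = net(x)
print(out["x_hat"].shape)               # reconstruction, same shape as x
print(list(out["likelihoods"].keys()))  # ['y', 'z']: main latent + hyper-latent
```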

Citation

APA:

Biquard, M., Chabert, M., Genin, F., Latry, C., & Oberlin, T. (2025). Deep priors for satellite image restoration with accurate uncertainties. IEEE Transactions on Geoscience and Remote Sensing, 63, 1-16.

BibTeX:

@ARTICLE{11258607,
  author={Biquard, Maud and Chabert, Marie and Genin, Florence and Latry, Christophe and Oberlin, Thomas},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  title={Deep Priors for Satellite Image Restoration With Accurate Uncertainties},
  year={2025},
  volume={63},
  pages={1-16},
  keywords={Image restoration;Inverse problems;Uncertainty;Satellites;Satellite images;Optical imaging;Image resolution;Optical sensors;Image coding;Autoencoders;Deep regularization (DR);latent optimization;plug-and-play (PnP) methods;posterior sampling;satellite image restoration (IR);uncertainty quantification (UQ)},
  doi={10.1109/TGRS.2025.3633774}
}

Model Card Contact

Contact: [email protected]

References

[1] Minnen, D., Ballé, J., & Toderici, G. D. (2018). Joint autoregressive and hierarchical priors for learned image compression. Advances in neural information processing systems, 31.

[2] Cheng, Z., Sun, H., Takeuchi, M., & Katto, J. (2020). Learned image compression with discretized Gaussian mixture likelihoods and attention modules. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 7939-7948).

[3] Karras, T., Laine, S., & Aila, T. (2019). A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 4401-4410).

[4] Martin, D., Fowlkes, C., Tal, D., & Malik, J. (2001, July). A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings eighth IEEE international conference on computer vision. ICCV 2001 (Vol. 2, pp. 416-423). IEEE.

[5] Institut Géographique National (IGN). PCRS. https://www.data.gouv.fr/datasets/pcrs/
