Pytorch AutoEncoder

This repository contains PyTorch autoencoder examples.

Most of the code is copied and adapted from pytorch-generative.


Table of contents

  1. Tutorial
  2. Experiment Result
  3. Generated Image
  4. Reference

🌱Tutorial

  1. Clone this repo.

    git clone https://github.com/hankyul2/pytorch-ae.git
  2. Train your model.

    python3 train.py -m nade
  3. Use the trained model in your own way. The snippet below shows how to sample from a trained model; a data-loading sketch follows this list.

    import torch
    from torchvision.utils import save_image
    from pae.model import NADE

    # Rebuild the model and load a trained checkpoint.
    model = NADE()
    model.load_state_dict(torch.load('your_checkpoint.pth'))

    # Draw 16 samples on CPU and reshape them to (N, C, H, W) before saving.
    generated_img = model.sample(16, 'cpu').reshape(16, 1, 28, 28)
    save_image(generated_img, 'generated_by_NADE.jpg')
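
If you want to feed real images to a trained model (for reconstruction or likelihood evaluation), you need binarized MNIST inputs. Below is a minimal sketch using torchvision; the 0.5 threshold is an assumption and may differ from the repository's own data pipeline.

    import torch
    from torchvision import datasets, transforms

    # Binarize grayscale MNIST by thresholding pixel intensities at 0.5
    # (a common convention; adjust to match the repository's data pipeline).
    binarize = transforms.Compose([
        transforms.ToTensor(),
        transforms.Lambda(lambda x: (x > 0.5).float()),
    ])

    test_set = datasets.MNIST(root='data', train=False, download=True, transform=binarize)
    x, _ = test_set[0]  # x: (1, 28, 28) tensor with values in {0, 1}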

🍀Experiment Result

Negative log-likelihood (NLL) loss on the binarized MNIST dataset. A sketch of how this metric can be computed follows the table.

| Method | Command | NLL | Pretrained model |
| --- | --- | --- | --- |
| NADE [1] | `python3 train.py -m NADE` | 84.0 | [code] [weight] [log] |
| MADE [2] | `python3 train.py -m MADE` | 83.8 | [code] [weight] [log] |
| PixelCNN [3] | `python3 train.py -m PixelCNN` | 81.7 | [code] [weight] [log] |
| Gated PixelCNN [4] | `python3 train.py -m GatedPixelCNN` | 81.7 | [code] [weight] [log] |
| PixelCNN++ [5] | `python3 train.py -m PixelCNN++ -b 128 --lr 2.5e-4` | 78.2 | [code] [weight] [log] |
| PixelSnail [6] | `python3 train.py -m PixelSnail -b 64 --lr 1.25e-4` | | |
| PixelSnail++ [5][6] | `python3 train.py -m PixelSnail++ -b 64 --lr 1.25e-4` | | |
| AE [7] | | | |
| VAE [8] | | | |
| Categorical-VAE [9] | | | |
| VQ-VAE [10] | | | |
| VQ-VAE-v2 [11] | | | |
| dVAE [12] | | | |
| DDPM [13] | | | |
| CDM [14] | | | |
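
The numbers above follow the usual binarized-MNIST convention of nats per test image (an assumption about this table, not stated explicitly in the repository). As a rough illustration, the sketch below computes such a per-image NLL for a model that outputs per-pixel Bernoulli logits; the repository's own loss code may be organized differently.

    import torch
    import torch.nn.functional as F

    def bernoulli_nll(logits, targets):
        """Per-image negative log-likelihood in nats.

        logits, targets: tensors of shape (batch, 1, 28, 28);
        targets are binarized to {0, 1}.
        """
        # Elementwise -log p(x_i) for Bernoulli pixels.
        nll = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
        # Sum over all pixels of each image -> shape (batch,).
        return nll.flatten(1).sum(dim=1)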

Issues

  1. MADE: we could not use the order- and connectivity-agnostic training tricks proposed in the original paper.
  2. PixelCNN: we could not fully understand, and therefore did not implement, the PixelRNN model, which is the main focus of the original paper.
  3. Changing batch size & learning rate: some models could not be trained with the default -b 512, so we reduced the batch size to fit our GPU memory and scaled the learning rate linearly (see the sketch below).
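
The linear scaling in issue 3 simply keeps the ratio of learning rate to batch size constant. The sketch below illustrates the rule; the base learning rate of 1e-3 at -b 512 is inferred from the PixelCNN++ and PixelSnail commands above, not documented by the repository.

    def scale_lr(base_lr, base_batch, batch):
        """Linear learning-rate scaling: lr is kept proportional to batch size."""
        return base_lr * batch / base_batch

    # Matches the commands in the results table, assuming base_lr=1e-3 at batch 512.
    print(scale_lr(1e-3, 512, 128))  # 0.00025  (2.5e-4,  PixelCNN++)
    print(scale_lr(1e-3, 512, 64))   # 0.000125 (1.25e-4, PixelSnail)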

🖼️Generated Image

| Method | Reconstructed Image | Randomly Sampled Image |
| --- | --- | --- |
| NADE [1] | val_49 | sample_49 |
| MADE [2] | val | sample |
| PixelCNN [3] | val | sample |
| Gated PixelCNN [4] | | |
| PixelCNN++ [5] | | |
| PixelSnail [6] | | |
| PixelSnail++ [5][6] | | |
| AE [7] | | |
| VAE [8] | | |
| Categorical-VAE [9] | | |
| VQ-VAE [10] | | |
| VQ-VAE-v2 [11] | | |
| dVAE [12] | | |
| DDPM [13] | | |
| CDM [14] | | |

🍁Reference

Footnotes

  1. NADE: "Neural Autoregressive Distribution Estimation", JMLR, 2016 [paper]

  2. MADE: "Masked Autoencoder for Distribution Estimation", PMLR, 2015 [paper]

  3. PixelCNN: "Pixel Recurrent Neural Networks", PMLR, 2016 [paper]

  4. Gated PixelCNN: "Conditional Image Generation with PixelCNN Decoders", NIPS, 2016 [paper]

  5. PixelCNN++: "Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications", ICLR, 2017 [paper]

  6. PixelSnail: "PixelSNAIL: An Improved Autoregressive Generative Model", PMLR, 2018 [paper]

  7. AE: "Autoencoders, Unsupervised Learning, and Deep Architectures", JMLR, 2012 [paper]

  8. VAE: "Auto-Encoding Variational Bayes", ArXiv, 2013 [paper]

  9. Categorical-VAE: "Categorical Reparameterization with Gumbel-Softmax", ICLR, 2017 [paper]

  10. VQ-VAE: "Neural Discrete Representation Learning", NIPS, 2017 [paper]

  11. VQ-VAE-v2: "Generating Diverse High-Fidelity Images with VQ-VAE-2", NIPS, 2019 [paper]

  12. dVAE: "Zero-Shot Text-to-Image Generation", PMLR, 2021 [paper]

  13. DDPM: "Denoising Diffusion Probabilistic Models", NIPS, 2020 [paper]

  14. CDM: "Cascaded Diffusion Models for High Fidelity Image Generation", JMLR, 2022 [paper]