Official PyTorch repo for JoJoGAN: One Shot Face Stylization with VIDEO and TRAINING

bycloudai/JoJoGAN-Training-Windows

JoJoGAN: One Shot Face Stylization w/ video results & training script

This is the PyTorch implementation of JoJoGAN: One Shot Face Stylization.

Abstract:
While there have been recent advances in few-shot image stylization, these methods fail to capture stylistic details that are obvious to humans. Details such as the shape of the eyes, the boldness of the lines, are especially difficult for a model to learn, especially so under a limited data setting. In this work, we aim to perform one-shot image stylization that gets the details right. Given a reference style image, we approximate paired real data using GAN inversion and finetune a pretrained StyleGAN using that approximate paired data. We then encourage the StyleGAN to generalize so that the learned style can be applied to all other images.

This is a forked Windows installation tutorial; the main code will not be updated.

Follow this YouTube tutorial for a walkthrough of the installation process. If you have any questions, feel free to join my Discord and ask there. The code is mostly taken from the official Google Colab and modified for local use.

Setup Environment

Step 0: Download and install Anaconda

Download this repository

Step 1:

conda create -n jojo python=3.7
conda activate jojo
cd <your codes file directory here>

Step 2 option 1: 30 series NVIDIA GPU

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

Step 2 option 2: non-30-series NVIDIA GPU

conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch

Step 2 option 3: CPU only (no NVIDIA GPU)

conda install pytorch torchvision torchaudio cpuonly -c pytorch

Step 3

pip install -r requirements.txt
pip install cmake
pip install dlib==19.20
conda install -c conda-forge ffmpeg
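
Before moving on, it can help to confirm which device PyTorch can actually use. The following is a minimal sketch and not part of this repository: `pick_device` is a hypothetical helper that returns `cuda` only when PyTorch is importable and reports a usable GPU, and falls back to `cpu` otherwise. The returned string matches the values accepted by the `--device` flag used in the commands further down.

```python
import importlib.util


def pick_device(requested: str = "cuda") -> str:
    """Return "cuda" only if PyTorch is installed and sees a GPU, else "cpu".

    Hypothetical helper for illustration; it is not part of this repository.
    """
    if requested != "cuda":
        return "cpu"
    # Avoid an ImportError when run outside the conda environment.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Use --device", pick_device())
```

Run it inside the activated `jojo` environment. If it prints `cpu` on a machine with an NVIDIA GPU, the CUDA build chosen in Step 2 probably does not match the installed driver.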

Download Models

checkpoints:

pretrained style models (optional):

model structure

📂JoJoGAN/ # this is root
├── 📂models/
│	├── 📜stylegan2-ffhq-config-f.pt
│	├── 📜e4e_ffhq_encode.pt
│	├── 📜restyle_psp_ffhq_encode.pt
│	├── 📜dlibshape_predictor_68_face_landmarks.dat
│	├── 📜<any pretrained style models>
│	│...
│...
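
A quick way to check the layout above is to look for each required file before running anything. This is a minimal sketch, assuming it is run from the repository root; `missing_models` is a hypothetical helper, and the file names are copied from the diagram. Pretrained style models are optional, so they are not checked.

```python
from pathlib import Path
from typing import List

# File names copied from the directory layout above.
REQUIRED_MODELS = (
    "stylegan2-ffhq-config-f.pt",
    "e4e_ffhq_encode.pt",
    "restyle_psp_ffhq_encode.pt",
    "dlibshape_predictor_68_face_landmarks.dat",
)


def missing_models(root: str = ".") -> List[str]:
    """List the required files that are absent from <root>/models."""
    models_dir = Path(root) / "models"
    return [name for name in REQUIRED_MODELS if not (models_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing from models/:", ", ".join(missing))
    else:
        print("All required checkpoints found.")
```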

Evaluate a Pretrained Style Model on Image

Download the pretrained style model and put it in the models folder, as shown in the diagram above. Put the input image in the test_input folder. For <image_name>, give only the file name, not the file path.

python evaluate.py --input <image_name> --model_name <model_name> --seed <random_seed> --device <cuda/cpu>

e.g.

python evaluate.py --device cuda --input iu.jpeg --model_name jojo --seed 3000

Evaluate a Pretrained Style Model on Video

Put the input video in the test_input folder. For <video_name>, give only the file name, not the file path.

python evaluate_video.py --input <video_name> --model_name <model_name> --seed <random_seed> --device <cuda/cpu>

e.g.

python evaluate_video.py --device cuda --input elon.mp4 --model_name jojo --seed 3000
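
When there are many inputs, the two evaluation commands above can be scripted. The sketch below is not part of this repository: `build_command` and `evaluate_all` are hypothetical helpers, and the list of video extensions is an assumption. It uses only the flags documented above and passes each file by name, as the scripts expect.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Assumption: files with these extensions are videos; everything else in
# test_input is treated as an image. Adjust to match your own inputs.
VIDEO_SUFFIXES = {".mp4", ".mov", ".avi"}


def build_command(file_name: str, model_name: str, seed: int = 3000,
                  device: str = "cuda") -> List[str]:
    """Build the argument list for one evaluation run."""
    is_video = Path(file_name).suffix.lower() in VIDEO_SUFFIXES
    script = "evaluate_video.py" if is_video else "evaluate.py"
    return [sys.executable, script, "--input", file_name,
            "--model_name", model_name, "--seed", str(seed), "--device", device]


def evaluate_all(model_name: str, input_dir: str = "test_input") -> None:
    """Run the matching script once for every file in the input folder."""
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file():
            # Pass only the file name, not the path.
            subprocess.run(build_command(path.name, model_name), check=True)
```

Calling `evaluate_all("jojo")` from the repository root would stylize every file in test_input with the jojo model, stopping at the first run that fails.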

Train a Custom Model

Add images that share the same style to the style_images folder. See the files already in that folder for examples.

python train_custom_style.py --model_name <new_name> --alpha <alpha_value> --preserve_color <True/False> --num_iter <number_of_iterations> --device <cuda/cpu>
  • model_name: a name for your new model, for example one based on the style images.
  • alpha: the strength of the style, a float between 0 and 1, where 0 is the strongest and 1 the weakest.
  • preserve_color: whether to preserve the color from the style images. This should be a boolean, True or False.
  • num_iter: the number of training iterations. 300 to 500 is usually enough.
  • device: use cpu if you don't have an NVIDIA GPU with CUDA. Otherwise use cuda, which is the default, so you can omit the flag.
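
The value ranges listed above can be expressed as an argument parser. How train_custom_style.py actually parses its flags is not shown here, so treat the following as a hypothetical sketch: `str_to_bool`, `unit_float`, and `parse_args` are illustrative names, and every default except `--device cuda` is an assumption. It also shows why a flag written as `--preserve_color False` needs explicit parsing, since `bool("False")` is `True` in Python.

```python
import argparse
from typing import List, Optional


def str_to_bool(value: str) -> bool:
    """Parse "True"/"False" text; plain bool("False") would be True."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def unit_float(value: str) -> float:
    """Parse a float and require it to lie between 0 and 1 inclusive."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise argparse.ArgumentTypeError(f"alpha must be between 0 and 1, got {number}")
    return number


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Hypothetical training options")
    parser.add_argument("--model_name", required=True)
    parser.add_argument("--alpha", type=unit_float, default=0.0)  # default is an assumption
    parser.add_argument("--preserve_color", type=str_to_bool, default=False)  # assumption
    parser.add_argument("--num_iter", type=int, default=300)  # assumption
    parser.add_argument("--device", choices=("cuda", "cpu"), default="cuda")
    return parser.parse_args(argv)
```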

e.g.

python train_custom_style.py --model_name custom --alpha 0.0 --preserve_color False --num_iter 300 --device cuda

To evaluate the new model, follow the evaluation steps above and set model_name to the name you just chose. For example:

python evaluate.py --device cuda --input iu.jpeg --model_name custom --seed 3000

Force training (manual align style image)

If the face in your style image cannot be detected, try force_train.py. This is how I trained the colossal model. Save this image and open it in Photoshop or Photopea, then align your style image to the features of this Colossal Titan: eyes to eyes, nose to nose, ears to ears, and jaw to jaw where possible. The more accurate the alignment, the better. Put the aligned image into the style_images_aligned folder and run:

python force_train.py --model_name <insert_name_here> --force_name <insert_style_image_here> --num_iter 300 --device cuda

e.g.

python force_train.py --model_name colossal --force_name colossal --num_iter 300 --device cuda

Once you have the trained model, you can evaluate it like any other model.

My fork edits end here.

Updates

  • 2021-12-22 Integrated into Replicate using cog. Try it out on Replicate.

  • 2022-02-03 Updated the paper. Improved stylization quality using discriminator perceptual loss. Added sketch model

  • 2021-12-26 Added wandb logging. Fixed a finetuning bug that started finetuning from the previously loaded checkpoint instead of the base face model. Added art model

  • 2021-12-25 Added arcane_multi model which is trained on 4 arcane faces instead of 1 (if anyone has more clean data, let me know!). Better preserves features

  • 2021-12-23 Paper is uploaded to arxiv.

  • 2021-12-22 Integrated into Hugging Face Spaces 🤗 using Gradio. Try it out on Hugging Face Spaces.

  • 2021-12-22 Added pydrive authentication to avoid download limits from gdrive! Fixed running on cpu on colab.

How to use

Everything to get started is in the colab notebook.

Citation

If you use this code or ideas from our paper, please cite our paper:

@article{chong2021jojogan,
  title={JoJoGAN: One Shot Face Stylization},
  author={Chong, Min Jin and Forsyth, David},
  journal={arXiv preprint arXiv:2112.11641},
  year={2021}
}

Acknowledgments

This code borrows from StyleGAN2 by rosinality and from e4e. Some snippets of Colab code are from StyleGAN-NADA.
