Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior
Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang
- [2023/12/22] Code is available at GitHub!
- [2023/12/20] Paper is available at arXiv!
- [2023/12/14] Our code and paper will be released soon.
We have only tested on Ubuntu 22 with torch 2.0.1 & CUDA 11.7 on an A100. Make sure git, wget, and Eigen are installed.
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
apt update && apt upgrade
apt install git wget libeigen3-dev -y
Install with pip:
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install git+https://github.com/facebookresearch/pytorch3d.git
pip install git+https://github.com/S-aiueo32/contextual_loss_pytorch.git@4585061
pip install ./raymarching
pip install git+https://github.com/facebookresearch/segment-anything.git
Other dependencies:
pip install -r requirements.txt
- Zero-1-to-3 for the 3D diffusion prior. We use zero123-xl.ckpt by default; the reimplementation is borrowed from the Stable Diffusion repo and is available in nerf/zero123.py.
mkdir -p pretrained/zero123
cd pretrained/zero123
wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
cd ../../
- MiDaS for depth estimation. We use dpt_beit_large_512.pt. Put it in the folder pretrained/midas/.
mkdir -p pretrained/midas
cd pretrained/midas
wget https://github.com/isl-org/MiDaS/releases/download/v3_1/dpt_beit_large_512.pt
cd ../../
- Omnidata for normal estimation.
mkdir pretrained/omnidata
cd pretrained/omnidata
# assume gdown is installed
gdown '1wNxVO4vVbDEMEpnAi_jwQObf2MFodcBR&confirm=t'
cd ../../
- SAM to segment the foreground mask of an object.
cd mask
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
cd ..
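After the downloads above, a quick sanity check can confirm each checkpoint is where this README expects it (paths taken from the steps above; the Omnidata checkpoint is omitted because its filename comes from the gdown download):

```shell
# Check that each downloaded checkpoint exists at the path used in this README.
for f in pretrained/zero123/zero123-xl.ckpt \
         pretrained/midas/dpt_beit_large_512.pt \
         mask/sam_vit_h_4b8939.pth; do
  if [ -f "$f" ]; then echo "ok: $f"; else echo "missing: $f"; fi
done
```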
In the ./data directory, we have included some preprocessed examples with the multi-modal images already extracted. If you want to test your own example, follow the preprocessing steps below and match the file structure in ./data. Preprocessing takes only seconds.
You can preprocess a single image:
python preprocess_image.py --path /path/to/image
You can also preprocess images given in a list or a directory:
bash scripts/preprocess_list.sh $GPU_IDX
bash scripts/preprocess_folder.sh $GPU_IDX /path/to/dir
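Conceptually, folder preprocessing just maps the single-image preprocessor over a directory. A minimal sketch of that idea (an assumption about what the script does, not its actual contents; the directory name is made up and echo stands in for the real call so the sketch runs standalone):

```shell
# Sketch: map preprocess_image.py over every image in a directory.
DIR=demo_images
mkdir -p "$DIR" && touch "$DIR/a.png" "$DIR/b.png"   # two dummy inputs for illustration
for img in "$DIR"/*.png; do
  # real call would be: python preprocess_image.py --path "$img"
  echo "preprocess $img"
done
```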
Customize-It-3D uses the default DreamBooth from diffusers. To finetune the multi-modal DreamBooth model:
bash dreambooth/dreambooth.sh $GPU_IDX $INSTANCE_DIR $OUTPUT_DIR $CLASS_NAME $CLASS_DIR
$INSTANCE_DIR is the path to directory containing your own image.
$OUTPUT_DIR is the path where to save the trained model.
$CLASS_NAME is the text prompt describing the class of the generated sample images.
$CLASS_DIR is the path to a folder containing the generated class sample images.
For example:
bash dreambooth/dreambooth.sh 0 data/horse out/horse horse images_gen/horse
Don't forget the path of your trained model (in the ./out directory); you will need it for the 3D generation step below.
We use a progressive training strategy to generate a full 360° 3D geometry.
bash scripts/run.sh $GPU_IDX $WORK_SPACE $REF_PATH $Enable_First_Stage $Enable_Second_Stage $TRAINED_MODEL_PATH $CLASS_NAME {More_Arguments}
As an example, to run Customize-It-3D on the horse example, whose trained multi-modal DreamBooth model is at out/horse, using both stages on GPU 0 with the workspace and class name both set to horse, run:
bash scripts/run.sh 0 horse data/horse/rgba/rgba.png 1 1 out/horse horse
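The two flags after the reference image toggle the stages independently. For instance, a hypothetical rerun of only the second stage (assuming the first stage has already completed in the horse workspace) would look like:

```shell
# Enable_First_Stage=0, Enable_Second_Stage=1: second stage only.
bash scripts/run.sh 0 horse data/horse/rgba/rgba.png 0 1 out/horse horse
```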
- To run all examples in a folder, see the script scripts/run_folder.sh
- To run all examples in a given list, see the script scripts/run_list.sh
If you find this work useful, a citation would be appreciated:
@misc{huang2023customizeit3d,
title={Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior},
author={Nan Huang and Ting Zhang and Yuhui Yuan and Dong Chen and Shanghang Zhang},
year={2023},
eprint={2312.11535},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
This code borrows heavily from Stable-Dreamfusion; many thanks to the author.