Initial commit.
iszihan committed Oct 16, 2020
1 parent 7873146 commit 0a05188
Showing 34 changed files with 6,137 additions and 12 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
*.dat
*.pyc
91 changes: 91 additions & 0 deletions README.md
@@ -0,0 +1,91 @@
# MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

Code for the following paper:

MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images
[Benjamin Attal](https://www.battal.me/), [Selena Ling](https://www.selenaling.com/), [Aaron Gokaslan](https://skylion007.github.io/), [Christian Richardt](https://richardt.name/), [James Tompkin](https://www.jamestompkin.com)
ECCV 2020

![High-level overview of approach.](teaser_small.png)

See more at our [project page](http://visual.cs.brown.edu/matryodshka).

Note that our code is based on the [code release](https://github.com/google/stereo-magnification/tree/aae16f7464d8a001b59c3bef6076ae8cb7bd043d) from the paper "Stereo Magnification: Learning view synthesis using multiplane images" [[1]](#1).

## Setup
* Create a conda environment from the matryodshka-gpu.yml file.
* Run `./download_glob.sh` to download the files needed for training and testing.
* Download the dataset as described in the [Replica dataset](#replica-dataset) section below.
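
A minimal setup sketch, assuming the environment defined in matryodshka-gpu.yml is named `matryodshka-gpu`:

```bash
# Create and activate the conda environment from the provided file.
conda env create -f matryodshka-gpu.yml
conda activate matryodshka-gpu

# Download the files needed for training and testing.
./download_glob.sh
```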

## Training the model
See `train.py` for training the model.

* To train with transform inverse regularization, use the `--transform_inverse_reg` flag.

* To train with CoordNet, use the `--coord_net` flag.

* To experiment with different losses (elpips or l2), use the `--which_loss` flag.
* To train with spherical weighting on the loss maps, use the `--spherical_attention` flag.

* To train with a graph convolutional network (GCN), use the `--gcn` flag. Note that the particular GCN architecture definition we used is from the [Pixel2Mesh](https://github.com/nywang16/Pixel2Mesh) repo [[3]](#3).

* The current scripts support training on the Replica 360 and cubemap datasets and the RealEstate10K dataset. Use the `--input_type` flag to switch between these input types (`ODS`, `PP`, `REALESTATE_PP`).

See `scripts/train/*.sh` for sample scripts.
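
As an illustrative sketch, a training run combining some of the flags above might look like the following; any additional arguments `train.py` requires (e.g. dataset paths) are not documented here and are omitted:

```bash
# Hypothetical training run on Replica ODS input with transform
# inverse regularization, CoordNet, and the elpips loss; only the
# flags documented above are shown.
python train.py \
  --input_type ODS \
  --transform_inverse_reg \
  --coord_net \
  --which_loss elpips
```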

## Testing the model
See `test.py` for testing the model on the Replica 360 test set.
* When testing on video frames (e.g. test_video_640x320), include `on_video` in the `--test_type` flag.
* When testing on high-resolution images, include `high_res` in the `--test_type` flag.

See `scripts/test/*.sh` for sample scripts.
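
For example (a sketch; the exact `--test_type` strings are assumptions based on the keywords above):

```bash
# Hypothetical test invocations; the --test_type values are assumed
# from the keywords documented above.
python test.py --test_type on_video   # video frames (test_video_640x320)
python test.py --test_type high_res   # high-resolution images
```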

## Evaluation
See `eval.py` for evaluating the model, which saves the metric scores to a JSON file. We evaluate our models on:
* third-view reconstruction quality
  * See `scripts/eval/*-reg.sh` for a sample script.

* frame-to-frame reconstruction differences on video sequences, to evaluate the effect of transform inverse regularization on temporal consistency
  * Include `on_video` when specifying the `--eval_type` flag.
  * See `scripts/eval/*-video.sh` for a sample script.
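
For example (a sketch; any dataset or checkpoint arguments `eval.py` needs are omitted):

```bash
# Hypothetical evaluation runs; metric scores are saved to a JSON file.
python eval.py                        # third-view reconstruction quality
python eval.py --eval_type on_video   # temporal consistency on video
```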

## Pre-trained model
Download models pre-trained with and without transform inverse regularization by running `./download_model.sh`.
These can also be found [here at the Brown library](https://doi.org/10.26300/spba-rp45) for archival purposes.
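
For example (the archive name comes from `download_model.sh`; where you unpack it is up to you):

```bash
# Fetch and unpack the pre-trained checkpoints.
./download_model.sh
unzip pretrained-models.zip
```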

## Replica dataset
We rendered a 360 and a cubemap dataset for training from the Facebook Replica Dataset [[2]](#2).
This data can be found [here at the Brown library](https://doi.org/10.26300/spba-rp45) for archival purposes. You should have access to the following datasets:
* train_640x320
* test_640x320
* test_video_640x320

We also have a [fork of the Replica dataset codebase](http://coming.soon/) which can regenerate our data from scratch.
This contains customized rendering scripts that allow output of ODS, equirectangular, and cubemap projection spherical imagery, along with corresponding depth maps.

Note that the 360 dataset we release for download was rendered with an incorrect 90-degree camera rotation around the up vector and a horizontal flip. Regenerating the dataset from our released code fork with the customized rendering scripts will not include this coordinate change. The output model performance should be approximately the same.

## Exporting the model to ONNX
We export our model to ONNX by first converting the checkpoint into a .pb file, which is then converted to a .onnx file with the [tf2onnx](https://github.com/onnx/tensorflow-onnx) module.
See `export.py` for exporting the model to a .pb file.

See `scripts/export/model-name.sh` for a sample script to run `export.py`, and `scripts/export/pb2onnx.sh` for a sample script to run the .pb-to-ONNX conversion.
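
Putting the two steps together, the conversion might be sketched as follows; the graph file names and the `--inputs`/`--outputs` tensor names are placeholders that depend on the exported graph, not values taken from this repository:

```bash
# Step 1: export the TensorFlow checkpoint to a frozen .pb graph.
python export.py

# Step 2: convert the .pb graph to ONNX with tf2onnx. The tensor
# names below are hypothetical; substitute the input and output
# names of your exported graph.
python -m tf2onnx.convert \
    --input matryodshka.pb \
    --inputs input:0 \
    --outputs output:0 \
    --output matryodshka.onnx
```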

## Unity Application + ONNX to TensorRT Conversion
We are still working on releasing the real-time Unity application and onnx2trt conversion scripts. Please bear with us!

## References
<a id="1">[1]</a>
Zhou, Tinghui, et al. "Stereo magnification: Learning view synthesis using multiplane images." arXiv preprint arXiv:1805.09817 (2018).
[https://github.com/google/stereo-magnification](https://github.com/google/stereo-magnification)

<a id="2">[2]</a>
Straub, Julian, et al. "The Replica dataset: A digital replica of indoor spaces." arXiv preprint arXiv:1906.05797 (2019).
[https://github.com/facebookresearch/Replica-Dataset](https://github.com/facebookresearch/Replica-Dataset)

<a id="3">[3]</a>
Wang, Nanyang, et al. "Pixel2Mesh: Generating 3D mesh models from single RGB images." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
[https://github.com/nywang16/Pixel2Mesh](https://github.com/nywang16/Pixel2Mesh)
15 changes: 15 additions & 0 deletions __init__.py
@@ -0,0 +1,15 @@
#!/usr/bin/python
#
# Copyright 2018 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
2 changes: 2 additions & 0 deletions download_glob.sh
@@ -0,0 +1,2 @@
#!/bin/bash
wget -O glob.zip https://www.dropbox.com/s/8wojwe2qomrsqqo/glob.zip?dl=1
2 changes: 2 additions & 0 deletions download_model.sh
@@ -0,0 +1,2 @@
#!/bin/bash
wget -O pretrained-models.zip https://www.dropbox.com/s/codr6q0u1t3dtyc/pretrained-models.zip?dl=1
1 change: 1 addition & 0 deletions elpips
Submodule elpips added at 919200
