intelligent Video Sampler 3D

iVS3D is a framework for intelligent pre-processing of image sequences. iVS3D can downsample entire videos to a specific frame rate, as well as resize and crop the individual images. Furthermore, thanks to the modular architecture, developing and integrating plugins with additional algorithms is easy. We provide three plugins as baseline methods that enable an intelligent selection of suitable images and can enrich them with additional information. To filter out images affected by motion blur, we developed a plugin that detects these frames and searches the spatial neighborhood for suitable images as replacements. The second plugin uses optical flow to detect redundant images caused by a temporarily stationary camera. In our experiments, we show how this approach leads to a more balanced image sampling if the camera speed varies, and that excluding such redundant images leads to a time saving of 8.1 % for our sequences.

Link to paper submitted for the 16th International Symposium on Visual Computing (ISVC 2021).

Features

Import of images and videos (.jpg, .jpeg, .png, ..., .mp4, .mov, ...)
Import GPS metadata for images and videos. We support:
- EXIF-Tags in JPEG and PNG
- SRT-files for DJI-Drones (Matrice, Matrice2, Mavic and Mavic3T)
- Raw text files with the syntax "# framenumber utcUnixTimestampMicrosec lat lon alt"
- GPX files
Drag and drop to import images, videos, and open projects
Plugins for selecting images based on:
- Meta information such as frame rate or GPS locations (Nth Frame, Geo Distance, Geo Map)
- Image features such as camera blur and optical flow (Blur Detection, Smooth Camera Movement, Stationary Camera Removal)
- Visual embedding using deep neural networks (Deep Visual Similarity)
- More Sampling-Algorithms can be added using the plugin interface
Plugins to mask challenging areas in the input images to prevent these from being processed
- Semantic Segmentation using convolutional neural networks to mask vehicles and people
- More Transformation-Algorithms can be added using the plugin interface
Export with user-selected resolution and ROI (Region of Interest)
Optimised COLMAP Interface
- Start a 3D reconstruction with 2 clicks
GPU processing with NVIDIA CUDA Toolkit API
Use multiple plugins at once with batch processing
Can be used in a headless mode
Supported Platforms: Windows and Linux
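
As an illustration of the raw text GPS format listed above, here is a minimal Python sketch of a reader. It is not the importer used by iVS3D. It assumes that fields are separated by whitespace in the order "framenumber utcUnixTimestampMicrosec lat lon alt" and that blank lines and lines starting with "#" carry no data. The names GpsSample and parse_gps_lines are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class GpsSample:
    frame: int          # frame number in the video or image sequence
    timestamp_us: int   # UTC Unix timestamp in microseconds
    lat: float          # latitude
    lon: float          # longitude
    alt: float          # altitude


def parse_gps_lines(lines: Iterable[str]) -> List[GpsSample]:
    """Parse lines of the form 'framenumber timestamp lat lon alt'."""
    samples = []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and '#' lines are headers or comments.
        if not line or line.startswith("#"):
            continue
        frame, timestamp, lat, lon, alt = line.split()
        samples.append(
            GpsSample(int(frame), int(timestamp), float(lat), float(lon), float(alt))
        )
    return samples
```

A file can be passed directly, for example parse_gps_lines(open(path)), since file objects iterate over their lines.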

The graphical user interface is split into five sections: 1. Input, 2. Sampling, 3. Export, 4. Executed steps, and 5. Video player with the timeline for selected images.

Getting started

This tutorial will guide you through a basic workflow with iVS3D. To follow along, download one of our latest Ready-To-Use Builds for Debian, Ubuntu or Windows, or compile from source for your platform. Then download a video from the Tanks and Temples Benchmark. We use the Lighthouse video.

Step 1: Import and preview

Run iVS3D-core and import the video. This can be done using the Open Input Video action in the File menu at the top. Alternatively, you can drag and drop the video into the application. Now you can preview the video.

Step 2: Select important images

In the timeline underneath the preview, all 8321 images are marked as selected, which is indicated by the red line. We want to reduce the number of images to speed up the reconstruction, so we use the Nth image selection plugin to sample down to one image per second. In the Image selection tab, select the Nth image selection plugin and hit Start selection. Now we are down to 277 selected images. To improve the quality of the images, we also run the Blur detection plugin. This replaces blurred images with sharper ones from their neighborhood. This might take a few minutes since we are processing 4K images.
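
The two selection steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of the iVS3D plugins. The function names, the start offset, the per-frame sharpness scores, the threshold and the window size are all assumptions made for this example. How sharpness is measured is left open here.

```python
from typing import List, Sequence


def select_every_nth(frame_count: int, n: int, start: int = 0) -> List[int]:
    """Return the indices start, start + n, start + 2n, ... below frame_count."""
    if n <= 0:
        raise ValueError("n must be positive")
    return list(range(start, frame_count, n))


def replace_blurred(
    selected: Sequence[int],
    sharpness: Sequence[float],
    threshold: float,
    window: int,
) -> List[int]:
    """Swap each selected frame below the threshold for its sharpest neighbor.

    sharpness holds one hypothetical score per frame (higher means sharper).
    Neighbors are the frames within `window` indices of the selected frame.
    """
    result = set()
    for idx in selected:
        if sharpness[idx] >= threshold:
            result.add(idx)  # frame is sharp enough, keep it
            continue
        lo = max(0, idx - window)
        hi = min(len(sharpness), idx + window + 1)
        # Pick the sharpest frame in the window around the blurred frame.
        result.add(max(range(lo, hi), key=lambda i: sharpness[i]))
    # A set removes duplicates when two blurred frames share a replacement.
    return sorted(result)
```

For a video recorded at 30 frames per second, n = 30 corresponds to one image per second. The exact number of selected images depends on the start offset and on how the plugin rounds at the end of the sequence.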

You can see all the steps that were performed in the Executed steps tab. There you can revert to an older selection of images if you wish. More plugins for automated image selection are available, see here for a detailed overview.

Step 3: Export selected images

Once the algorithm is finished, we can export the selected images. In the Export tab, select a fitting location and name for this set of images. We choose export in the example. You can also change the resolution of the images. To speed things up, we reduced the image resolution to HD and hit export.

Step 4: Reconstruct 3D scene

Now the images have been written to disk. Open your file explorer and navigate to the export location you chose to see the result. We can use the images to create a 3D point cloud with COLMAP. To do so, follow the instructions here.

Plugins

There are currently 8 plugins implemented:

Plugin	Description	Supports CUDA
NthFrame	Selects every N-th frame
Blur Detection	Avoids blurry images
GeoDistance	(requires GPS) Selects images based on the distance between their GPS locations
GeoMap	(requires GPS) Displays an interactive map for the user to select GPS poses manually
Smooth Camera Movement		✅
Stationary Camera Removal	Selects images based on camera movement	✅
Deep Visual Similarity	Find images with the lowest similarity based on their visual embeddings	✅

Semantic Segmentation	Creates binary masks to exclude objects such as vehicles from the reconstruction by using convolutional neural networks for semantic image segmentation	✅

These plugins show different approaches to extracting more information from an image sequence or video, either by selecting images or by creating additional masks to improve the 3D reconstruction process. See here for a detailed description of the above-mentioned plugins.
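
To make the idea behind distance-based selection concrete, the following Python sketch keeps an image only once the camera has moved a minimum distance from the last kept image. It is a simplified illustration and not the GeoDistance plugin itself. The haversine formula, the mean Earth radius, the function names and the min_distance_m parameter are assumptions made for this example, and altitude is ignored.

```python
import math
from typing import List, Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (assumption)


def haversine_m(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def select_by_distance(
    positions: Sequence[Tuple[float, float]], min_distance_m: float
) -> List[int]:
    """Keep the first image, then each image at least min_distance_m from the last kept one."""
    if not positions:
        return []
    selected = [0]
    for i in range(1, len(positions)):
        if haversine_m(positions[selected[-1]], positions[i]) >= min_distance_m:
            selected.append(i)
    return selected
```

Measuring from the last kept image rather than from the previous frame means that slow camera movement accumulates until the threshold is reached, so the kept images end up roughly evenly spaced along the path.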

iVS3D is built with an open plugin interface for adding new plugins. So feel free to add your own. See here for creating your own plugin.

3D Reconstruction

iVS3D prepares the data for 3D reconstruction. For now, it does not perform the reconstruction itself. On Windows, iVS3D provides functionality to configure and start COLMAP, which performs the reconstruction on the prepared data. This saves time and simplifies the reconstruction process. Make sure to install Python 3.9 or later for the reconstruction!

The next section is Linux only: OTS integration of COLMAP is not supported on Windows yet!

With the latest update, we introduce a seamless integration of COLMAP in our software. In the new Reconstruction tab you can configure and start COLMAP reconstructions, view the reconstruction progress, manage the queue and open the finished products.

Reconstruction can be configured to be executed on the local machine or a remote machine such as a GPU server. Further information:

Ready to use builds for Windows and Linux

We provide builds with and without CUDA for multiple platforms and distributions:

OS	CPU only	CUDA enabled
Windows 10/11	✅	✅
Ubuntu 22.04	✅	✅
Debian 11	✅	✅
Debian 12	✅	✅

Check the latest release to get a build for your platform!

Note that the CUDA builds support GTX 10xx and RTX series GPUs. Older GPUs or laptop GPUs might require building iVS3D from source with an OpenCV and CUDA build for that specific GPU.

To use the included plugin for semantic segmentation you can download the models we used in our paper: Link to models

To use other models, they have to be in the .onnx format. In addition, the plugin requires a file that maps the classes to specific colors.

Build from source

Dependencies

iVS3D and the baseline plugins use:

OpenCV 4.7.0 with contrib modules
Qt Framework 5.15.2

For CUDA support:

NVIDIA CUDA Toolkit API 12.0

On Windows, we use the MSVC compiler, which is shipped with Visual Studio. On Linux, we use the GCC 10 compiler.

iVS3D uses the CMake build system, which is available in the terminal or in Qt Creator. For detailed instructions on building from source see here:

Build using Linux terminal
Build using Windows terminal
Build in Qt Creator

Tests

To create the test build, pass -DBuild_Tests=ON when configuring your build with CMake. You can then run the tests within the Test Result tab in Qt Creator or use ctest to run the test suite in your terminal.

Link to our test data

Future Work

Add remote COLMAP execution for Windows
Add seamless COLMAP integration for Windows

Licence

MIT, see LICENSE

Citations

Knapitsch et al.: Tanks and Temples Benchmark (2017): website

[1] Knapitsch et al. (2017). Tanks and Temples: Benchmarking Large-Scale Scene Reconstruction. ACM Transactions on Graphics, vol 36, Issue 4, Article No. 78, Pages 1-13. https://doi.org/10.1145/3072959.3073599

[2] Farnebäck (2003). Two-Frame Motion Estimation Based on Polynomial Expansion. Proceedings of the SCIA 2003. Lecture Notes in Computer Science, vol 2749. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45103-X_50

Authors

Patrick Binder, Daniel Brommer, Lennart Ruck, Dominik Wüst, Dominic Zahn

Fraunhofer IOSB, Karlsruhe

Supervisors: Max Hermann & Thomas Pollok

Created as part of PSE at the Karlsruhe Institute of Technology in the winter term 2020/21
