We used the open-source Nerfstudio library to train the networks in the cloud. The networks can also be trained locally using CUDA on NVIDIA GPUs.
Given a video or a set of images of a handheld object, we use image processing to segment the hand and remove it from each image, then train on the resulting frames in the cloud, again using the Nerfstudio library.
The pipeline consists of the following steps:
- extracting image frames from the input video and resizing them.
- running COLMAP on the images to preprocess the data.
- training a NeRF on the COLMAP output.
- rendering the result.
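The steps above can be sketched as a sequence of command lines. This is only an illustration under stated assumptions, not the exact commands we ran: the use of `ffmpeg` for frame extraction, the `nerfacto` model, and every file and directory name are assumptions, and the Nerfstudio subcommands may differ between versions. The function only builds the argument lists and does not execute anything.

```python
from pathlib import Path


def build_pipeline_commands(video: Path, work_dir: Path) -> list[list[str]]:
    """Return the four pipeline stages as argument lists (nothing is executed)."""
    frames_dir = work_dir / "frames"        # hypothetical directory name
    processed_dir = work_dir / "processed"  # hypothetical directory name
    return [
        # 1. Extract frames from the video and resize them to 480x853
        #    (ffmpeg is an assumed tool; frames_dir must already exist).
        ["ffmpeg", "-i", str(video), "-vf", "scale=480:853",
         str(frames_dir / "frame_%05d.png")],
        # 2. Run COLMAP on the frames through Nerfstudio's data processor.
        ["ns-process-data", "images", "--data", str(frames_dir),
         "--output-dir", str(processed_dir)],
        # 3. Train a NeRF on the COLMAP output (nerfacto is an assumed model).
        ["ns-train", "nerfacto", "--data", str(processed_dir)],
        # 4. Render along a camera path; the config and camera-path files
        #    are placeholders for files produced during and after training.
        ["ns-render", "camera-path",
         "--load-config", str(work_dir / "config.yml"),
         "--camera-path-filename", str(work_dir / "camera_path.json"),
         "--output-path", str(work_dir / "render.mp4")],
    ]


if __name__ == "__main__":
    for command in build_pipeline_commands(Path("input.mp4"), Path("work")):
        print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the placeholder paths point at real files. The hand-removal step is not shown here; it would operate on the extracted frames before COLMAP is run.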
Our input data consisted of 219 frames of size 720x1280. We resized the frames to 480x853 before preprocessing, so the images after hand extraction also have a resolution of 480x853.
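The target size follows from keeping the aspect ratio of the original frames. A minimal sketch of that arithmetic is shown below; the function name `resize_keep_aspect` is hypothetical and is not part of any library.

```python
def resize_keep_aspect(width: int, height: int, new_width: int) -> tuple[int, int]:
    """Scale (width, height) to new_width, rounding the height to the nearest pixel."""
    new_height = round(height * new_width / width)
    return new_width, new_height


# 1280 * 480 / 720 = 853.33..., which rounds to 853.
print(resize_keep_aspect(720, 1280, 480))  # (480, 853)
```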
We extracted the frames from the video and obtained the following sequence.
After removing the hands and the background noise, we got the following images.
After running COLMAP on the sequence, we obtained the following camera poses around the object.
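As a rough way to inspect these poses numerically, the sketch below reads camera positions from a `transforms.json` file in the format Nerfstudio writes after processing, where each frame carries a 4x4 camera-to-world `transform_matrix`. The file location and the helper names are assumptions, and measuring distance from the origin is only meaningful if the object sits near the origin of the reconstructed scene.

```python
import json
import math
from pathlib import Path


def load_transforms(path: Path) -> dict:
    """Load a transforms.json file (the path is supplied by the caller)."""
    return json.loads(path.read_text())


def camera_centers(transforms: dict) -> list[tuple[float, float, float]]:
    """Return each camera position: the translation column of its 4x4 matrix."""
    centers = []
    for frame in transforms["frames"]:
        m = frame["transform_matrix"]
        centers.append((m[0][3], m[1][3], m[2][3]))
    return centers


def distances_from_origin(centers: list[tuple[float, float, float]]) -> list[float]:
    """Euclidean distance of each camera from the scene origin."""
    return [math.hypot(x, y, z) for x, y, z in centers]


if __name__ == "__main__":
    # A tiny in-memory example with a single camera at (3, 4, 0).
    example = {
        "frames": [
            {
                "file_path": "images/frame_00001.png",
                "transform_matrix": [
                    [1.0, 0.0, 0.0, 3.0],
                    [0.0, 1.0, 0.0, 4.0],
                    [0.0, 0.0, 1.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0],
                ],
            }
        ]
    }
    print(distances_from_origin(camera_centers(example)))  # [5.0]
```

Roughly constant distances across frames would suggest the cameras were recovered on a ring or shell around the object.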