
Understanding data balancing #33

Open
mgarbade opened this issue Jan 9, 2018 · 5 comments


mgarbade commented Jan 9, 2018

In your paper you state:
"For each training volume containing N occupied voxels,

  • we randomly sample 2N empty voxels from occluded regions for training.

Voxels in

  • free space,
  • outside the field of view, or
  • outside the room

are ignored."

I assume all of that is done in this function, but I'm not sure I understand it correctly.

Now my question: How exactly do you sample the data?
What do you mean by "2N empty voxels from occluded regions"?

Given this image from your paper
[screenshot: figure from the paper showing the observed surface (red) and the occluded region (blue)]
my reading is that you sample empty voxels only from the blue ("occluded") area, while the red area ("observed surface") is completely ignored.

Is that correct?
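For concreteness, here is how I picture the rule from the paper, as a minimal standalone C++ sketch (made-up arrays; `sampleEmptyOccluded` is my own name, not from the repo): keep all N occupied voxels, then sample up to 2N empty voxels from the occluded region.

```cpp
// Minimal sketch of the balancing rule as stated in the paper:
// keep all N occupied voxels, then sample up to 2N empty voxels
// from the occluded region (tsdf < -0.5). Names are invented.
#include <algorithm>
#include <random>
#include <vector>

std::vector<int> sampleEmptyOccluded(const std::vector<int>& occupancy,
                                     const std::vector<float>& tsdf,
                                     std::mt19937& rng) {
  std::vector<int> empty_occluded;
  std::size_t num_occupied = 0;
  for (std::size_t i = 0; i < occupancy.size(); ++i) {
    if (occupancy[i] > 0)
      ++num_occupied;                    // occupied voxel: always kept
    else if (tsdf[i] < -0.5f)
      empty_occluded.push_back((int)i);  // empty AND occluded: candidate
  }
  // Random subset of size min(2N, #candidates).
  std::shuffle(empty_occluded.begin(), empty_occluded.end(), rng);
  if (empty_occluded.size() > 2 * num_occupied)
    empty_occluded.resize(2 * num_occupied);
  return empty_occluded;
}
```

If there are fewer than 2N empty occluded voxels available, this sketch just takes all of them, which is one way the actual count could end up below 2N.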

mgarbade (Author) commented:

So after checking the code I'm fairly sure the misunderstanding is on my side. At least the number of voxels randomly sampled from the empty occluded space (the blue area minus the voxels occupied by objects) is far lower than 2*N, where N is the number of occupied voxels. I'll come back once I've found out how exactly the sampling works...

mgarbade (Author) commented:

    // Find number of occupied voxels
    // Save voxel indices of background
    // Set label weights of occupied voxels as 1
    int num_occ_voxels = 0;
    std::vector<int> bg_voxel_idx;
    float *occupancy_weight = new float[num_label_voxels];
    float *segmentation_weight = new float[num_label_voxels];

    memset(occupancy_weight, 0, num_label_voxels * sizeof(float));
    memset(segmentation_weight, 0, num_label_voxels * sizeof(float));

    LOG(INFO) << "checkkk 6";
    for (int i = 0; i < num_label_voxels; ++i) { 
      if (float(occupancy_label_downscale[i]) > 0) { // if voxel is full
          if (tsdf_data_downscale[i] < -0.5) {       // if voxel is occluded
            // foreground voxels in unobserved region
            num_occ_voxels++;
            occupancy_weight[i] = float(occupancy_class_weight[1]);
          } 
      } else {                                       // if voxel is empty
        if (tsdf_data_downscale[i] < -0.5) {         // if voxel is occluded
          bg_voxel_idx.push_back(i); // background voxels in unobserved region
        } 
      }

      if (float(segmentation_label_downscale[i]) > 0 && float(segmentation_label_downscale[i]) < 255) { // if voxel is full and not 255
        // foreground voxels within room
        if (surf_only){                                                                                 // surf_only == false
          if(abs(tsdf_data_downscale[i]) < 0.5){
            segmentation_weight[i] = float(segmentation_class_weight[ (int) segmentation_label_downscale[i] ]);
          }
        }else{
          segmentation_weight[i] = float(segmentation_class_weight[ (int) segmentation_label_downscale[i] ]);  // segmentation_weight = class_weight[label_id]
        }
      }

    }
    LOG(INFO) << "checkkk 7";
    // Raise the weight for a few indices of background voxels
    std::random_device tmp_rand_rd;
    std::mt19937 tmp_rand_mt(tmp_rand_rd());
    int segnegcout = 0;
    int segnegtotal = floor(sample_neg_obj_ratio * (float) num_occ_voxels);

    if (bg_voxel_idx.size() > 0) {
      std::uniform_real_distribution<double> tmp_rand_dist(0, (float) (bg_voxel_idx.size()) - 0.0001);
      for (int i = 0; i < num_occ_voxels; ++i) {                                                      // Iter over num_occ_voxels = #foreground voxels in unobserved region = voxel is full + voxel is occluded
        int rand_idx = (int) (std::floor(tmp_rand_dist(tmp_rand_mt)));                                // Get random idx between 0 and bg_voxel_idx.size

        occupancy_weight[ bg_voxel_idx[rand_idx] ] = float(occupancy_class_weight[0]);                // give this background voxel the empty-class weight

        if (segnegcout < segnegtotal && float(segmentation_label_downscale[ bg_voxel_idx[rand_idx] ]) < 255 ) { // Bug: segnegcout < segnegtotal is always true!
          // background voxels within room
          segmentation_weight[ bg_voxel_idx[rand_idx] ] = float(segmentation_class_weight[0]);        // Add at max "num_occ_voxels" empty occluded voxels (bg_voxel_idx) unless they belong to 255
          segnegcout++;                                                                               // Bug: According to paper 2N empty occluded voxels should have been added
        }
      }
    }
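To isolate what the loop above does, here is a condensed, self-contained paraphrase on synthetic data (`weightBackground` is an invented name, not from the repo). The key point is that the `num_occ_voxels` draws are made WITH replacement from `bg_voxel_idx`, so the number of distinct background voxels that receive a weight can never exceed `num_occ_voxels`:

```cpp
// Condensed paraphrase of the sampling loop: num_occ_voxels uniform
// draws WITH replacement from bg_voxel_idx. Returns the set of
// distinct background voxel indices that received a weight.
#include <random>
#include <set>
#include <vector>

std::set<int> weightBackground(int num_occ_voxels,
                               const std::vector<int>& bg_voxel_idx,
                               std::mt19937& rng) {
  std::set<int> weighted;
  if (!bg_voxel_idx.empty()) {
    std::uniform_int_distribution<int> dist(0, (int)bg_voxel_idx.size() - 1);
    for (int i = 0; i < num_occ_voxels; ++i)
      weighted.insert(bg_voxel_idx[dist(rng)]);  // repeats collapse in the set
  }
  return weighted;
}
```

Because repeated draws hit the same index, the distinct count is typically well below `num_occ_voxels`, which would explain observing far fewer than 2*N sampled voxels.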


mgarbade commented Jan 12, 2018

So the number of points sampled from empty occluded space varies and is not 2N (which is also not possible, since often there aren't 2N empty occluded voxels). Rather, the sampling goes over all voxels with TSDF value < -0.5, including occupied voxels and voxels outside the room. Voxels with ground-truth label == 255 are then dismissed.


KnightOfTheMoonlight commented May 7, 2018

Hi @mgarbade, I had the same questions as you did. From my understanding of the data-balancing code from suncg_data_layer.cu that you pasted above, the variable segmentation_weight is the key to the data balancing.

For occluded voxels with object classes [1:11] used in training (i.e. those satisfying Dtype(segmentation_label_downscale[i]) > 0 && Dtype(segmentation_label_downscale[i]) < 255), segmentation_weight is set to the corresponding segmentation class weight.

For the empty voxels used in training, they only pick empty voxels from among the occluded ones. You can see this from how the variable bg_voxel_idx is computed. As a reference, they treat occupancy_label_downscale as 0 for segmentation labels {0, 255} and as 1 for labels [1:11].

When you check the evaluation code, you will find that they only evaluate performance on the occluded areas.


KnightOfTheMoonlight commented May 7, 2018

And from their code, the number of empty voxels used for training is definitely not 2*N; in fact it will be at most N, and usually less.
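The "usually less than N" behavior would follow from drawing num_occ_voxels indices with replacement: the expected number of distinct picks after n draws from a pool of m is m*(1 - (1 - 1/m)^n), which is below n whenever repeats are possible. A quick check of that formula (my own illustration, not code from the repo):

```cpp
// Expected number of distinct indices after n draws with replacement
// from a pool of m indices: m * (1 - (1 - 1/m)^n). For n = m this is
// already only about 0.63 * n, consistent with seeing clearly fewer
// than N distinct empty voxels sampled.
#include <cmath>

double expectedUnique(int n, int m) {
  return m * (1.0 - std::pow(1.0 - 1.0 / m, n));
}
```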
