Error adding images to training #26

Open
SharhadBashar opened this issue Nov 17, 2021 · 3 comments


SharhadBashar commented Nov 17, 2021

I want to modify the images used for training by adding my own images. How do I go about doing that?
Here is what I did:

  1. I added the raw images to the JPEGImages folder.
  2. I added the segmented versions to the SegmentationClassAug folder.
  3. I added the image paths to train_aug.txt in voc12.
  4. I added their labels to train_label.npy, but I am not sure I added them correctly.
  5. I added their image names and class labels to cls_labels.npy in the same format as the other entries.
  6. I did not need to add them to 20_class_labels.npy, as they were already there.

I added about 1500 images. The extract-features step processes all 12000+ images, but when I run create_pseudo_label.py, the line that should print (12000, 200) (12000, 20) prints (18000, 200) (18000, 20) instead.
What am I doing wrong, or am I missing any steps?

Here's the error:

18605 18605
18605 18605 (16458, 20)
Traceback (most recent call last):
  File "create_pseudo_label.py", line 202, in <module>
    train_filename_list, train_label_200, train_label_20 = create_train_data(merge_filename_list, new_label_list, keep_idx_list)
  File "create_pseudo_label.py", line 126, in create_train_data
    train_label_20.append(label_20[idx])
IndexError: index 16460 is out of bounds for axis 0 with size 16458
k_cluster: 10
@WeiChihChern

  6. I did not need to add them to 20_class_labels.npy, as they were already there.
     I added about 1500 images. The extract-features step processes all 12000+ images, but when I run create_pseudo_label.py, the line that should print (12000, 200) (12000, 20) prints (18000, 200) (18000, 20) instead.

Do you mean you didn't change '20_class_labels.npy' at all? If you include your own images, you should add their labels to '20_class_labels.npy'. If each of your images contains only one class, you can simply append your labels to 20_class_labels.npy. If not... well, it's going to be complicated.
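For the one-class-per-image case, appending could look roughly like the sketch below. It assumes the array in '20_class_labels.npy' is a 2-D array of shape (N, 20) with one multi-hot row per image, which matches the (16458, 20) shape printed in the log above but is otherwise unverified. The function name `append_single_class_labels` and the 0-based class ids are made up for illustration and are not part of the repository.

```python
import numpy as np

NUM_CLASSES = 20  # assumption: the 20 PASCAL VOC foreground classes


def append_single_class_labels(labels, new_class_ids, num_classes=NUM_CLASSES):
    """Return a new (N + M, num_classes) array with M one-hot rows appended.

    labels        : existing (N, num_classes) label array
    new_class_ids : one 0-based class id per new image (hypothetical input)
    """
    labels = np.asarray(labels)
    new_class_ids = np.asarray(new_class_ids, dtype=np.int64)

    if labels.ndim != 2 or labels.shape[1] != num_classes:
        raise ValueError(f"expected shape (N, {num_classes}), got {labels.shape}")
    if new_class_ids.size and (
        new_class_ids.min() < 0 or new_class_ids.max() >= num_classes
    ):
        raise ValueError("class ids must lie in [0, num_classes)")

    # Build one row per new image with a single 1 at the class position.
    one_hot = np.zeros((new_class_ids.size, num_classes), dtype=labels.dtype)
    one_hot[np.arange(new_class_ids.size), new_class_ids] = 1

    return np.concatenate([labels, one_hot], axis=0)


# Toy example: 3 existing images, 2 new images of class 0 and class 19.
existing = np.zeros((3, NUM_CLASSES), dtype=np.float32)
extended = append_single_class_labels(existing, [0, 19])
```

On the real file you would load the array with `np.load`, extend it, and write it back with `np.save`. The new rows also have to be in the same order in which the script enumerates the new images, and that order is something to check in create_pseudo_label.py rather than take from this sketch.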

I think create_pseudo_label.py should generate the labels in '20_class_labels.npy' automatically, because the new label length is derived from the results of the functions make_filename_class_dict, merge_filename_class_dict, generate_repeat_list, and remove_duplicate_label, but somehow it does not.

@WeiChihChern

Let's discuss here so others who have similar issues can refer to this.

I want to include the validation images in training as well. That's why I don't need to make any changes to 20_class_labels.npy.
But I am not sure what to do with train_label.npy. These images are already included in cls_labels.npy.

Any ideas what I should do next?
I added the images to train_aug.txt, but I don't know what to do from there.

I took a glance at the training code, and it does not use a validation set. You can repeat what the authors do for the training set (feature extraction and pseudo-label generation) on your validation set, then load the result in the training code. If it's a validation set, though, you probably would not want to mix it with the training set.

I am not the author(s), so my interpretation could be incorrect!

@SharhadBashar

OK, so I found the cause of my issue.
20_class_labels.npy has a size of 16458, whereas adding new images makes the indices in keep_idx_list go beyond that (18605), since there are more images.
So my question is: how is the file 20_class_labels.npy created?
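A mismatch like this can be caught before the indexing loop runs. The sketch below is only a diagnostic under assumptions: `split_valid_indices` is a hypothetical helper, not a function in create_pseudo_label.py, and the sample indices are invented apart from 16460 and the sizes 16458 and 18605 taken from the traceback. It does not fix the underlying problem, which is that the label file has fewer rows than the merged filename list.

```python
def split_valid_indices(keep_idx_list, num_labels):
    """Split indices into those that fit a label array of length
    num_labels and those that would raise IndexError."""
    valid = [i for i in keep_idx_list if 0 <= i < num_labels]
    invalid = [i for i in keep_idx_list if not 0 <= i < num_labels]
    return valid, invalid


# Toy values: 16458 label rows, indices reaching up to 18604.
valid, invalid = split_valid_indices([0, 16457, 16460, 18604], 16458)

if invalid:
    print(f"{len(invalid)} indices exceed the label array, first: {invalid[0]}")
```

If the invalid list is non-empty, the label file and the filename list were built from different image sets, so the label file needs to be regenerated or extended to cover the added images.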
