
slice index out of bounds problem during training #36

Open
numancelik34 opened this issue Feb 6, 2021 · 11 comments

@numancelik34

I am getting the following error while trying to adapt the code to my dataset. My dataset is composed of 256,256,3 raw images and 256,256,1 labels, but I use a different type of images and I am not working with slices. I am trying to solve a binary segmentation problem.

ValueError: slice index 1 of dimension 3 out of bounds. for 'strided_slice_27' (op: 'StridedSlice') with input shapes: [?,256,256,1], [4], [4], [4] and with computed input tensors: input[1] = <0 0 0 1>, input[2] = <0 0 0 2>, input[3] = <1 1 1 1>.

Why am I getting this error?

Thanks
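
For context, this kind of ValueError is raised at graph-construction time when an op indexes a channel that the tensor's static shape does not have. A minimal reproduction, assuming TF 1.x (whether this is exactly the op behind strided_slice_27 in the SIFA graph is an assumption):

import tensorflow as tf

# A (batch, 256, 256, 1) binary label has only channel index 0, so asking
# for channel index 1 fails as soon as the graph is built.
labels = tf.placeholder(tf.float32, [None, 256, 256, 1])
fg = labels[:, :, :, 1]  # ValueError: slice index 1 of dimension 3 out of bounds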

@cchen-cc
Owner

cchen-cc commented Feb 8, 2021

Hi, I have never had this issue before. From the error message alone it's difficult for me to guess the reason. You may want to do a step-by-step debugging of the data reading.

@numancelik34
Author

Hi again @cchen-cc -

I think it is because my dataset may have a different format from yours, and the problem is probably caused by how the labels are defined in dataloader.py.
Let's say I have train -> image (256,256,3, dtype=float32), labels (256,256,1, dtype=float32); and target sets -> image (256,256,1, dtype=float32), labels (256,256,1, dtype=float32).

My aim is binary segmentation, so I have slightly changed dataloader.py:

label_vol = tf.decode_raw(parser['label_vol'], tf.float32)
label_vol = tf.reshape(label_vol, label_size)
label_vol = tf.slice(label_vol, [0, 0, 0], label_size)

# depth 1 because my problem is binary segmentation (0 and 1)
batch_y = tf.one_hot(tf.cast(tf.squeeze(label_vol), tf.uint8), 1)
return data_vol[:, :, :], batch_y

Please let me know whether this is the correct way to define the data loader for a binary segmentation problem.

Thanks

@cchen-cc
Owner

cchen-cc commented Feb 8, 2021

I'm not very sure, but I think setting the depth in tf.one_hot to 1 is incorrect.
If you use sigmoid cross-entropy to calculate the loss for a binary segmentation task, you don't need to change your labels to one-hot format. And if you want to use softmax cross-entropy, the depth in tf.one_hot should be 2.
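
For reference, a minimal sketch of the two options, assuming TF 1.x; the placeholder names (label_vol, logits_1ch, logits_2ch) are illustrative only, not the repo's exact variables:

import tensorflow as tf

label_vol = tf.placeholder(tf.float32, [256, 256, 1])   # binary mask, values 0.0 / 1.0
logits_1ch = tf.placeholder(tf.float32, [256, 256, 1])  # network output, 1 channel
logits_2ch = tf.placeholder(tf.float32, [256, 256, 2])  # network output, 2 channels

# Option 1: sigmoid cross-entropy -- keep the label single-channel, no one-hot needed.
loss_sigmoid = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=label_vol, logits=logits_1ch))

# Option 2: softmax cross-entropy -- one-hot with depth 2 (background / foreground),
# and the network output must also have 2 channels.
batch_y = tf.one_hot(tf.cast(tf.squeeze(label_vol, axis=-1), tf.uint8), depth=2)
loss_softmax = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=batch_y, logits=logits_2ch))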

@numancelik34
Author

@cchen-cc Now I am getting the error below:

OutOfRangeError (see above for traceback): RandomShuffleQueue '_1_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 8, current size 0)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

@muddasser-mm

muddasser-mm commented Feb 10, 2021

@cchen-cc Hello. Sorry for commenting in the middle of the discussion, but I'm facing the same error when trying to train, as mentioned above by @numancelik34:

Error - tensorflow.python.framework.errors_impl. OutOfRangeError: RandomShuffleQueue '_3_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 8, current size 0)

I did refer to your answer to the same issue in #13 and #21:
'This type of error usually happens when data cannot be correctly read. There can be various causes for why data are not read correctly and usually careful debugging is needed. Usually you could check the following things:
- whether the data paths stored in imagea_list and imageb_list are correct
- whether the data written into the tfrecords are with correct shape and type
- whether the data reading in the data_load.py is with correct shape and type'

The paths and the data in the variables are read correctly, but the issue still seems to occur when shuffle_batch() is executed during sess.run(self.inputs) in main.py. @cchen-cc it would be great if you could provide any further inputs or suggestions to debug the issue.
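
In case it helps, one way to rule out malformed tfrecords is to parse every record outside the queue pipeline and check the raw buffer sizes. A rough sketch assuming TF 1.x and the 'data_vol' / 'label_vol' feature keys referenced in this thread (the helper name and default shapes are illustrative):

import numpy as np
import tensorflow as tf

def check_tfrecord(path, image_shape=(256, 256, 3), label_shape=(256, 256, 1)):
    # Parse every serialized Example in the file and verify the stored
    # float32 buffers match the expected image / label sizes.
    count = 0
    for record in tf.python_io.tf_record_iterator(path):
        example = tf.train.Example()
        example.ParseFromString(record)
        feats = example.features.feature
        data = np.frombuffer(feats['data_vol'].bytes_list.value[0], dtype=np.float32)
        label = np.frombuffer(feats['label_vol'].bytes_list.value[0], dtype=np.float32)
        assert data.size == np.prod(image_shape), (path, count, data.size)
        assert label.size == np.prod(label_shape), (path, count, label.size)
        count += 1
    print('%s: %d records OK' % (path, count))

# e.g. run it over every path collected in imagea_list / imageb_list:
# for p in imagea_list + imageb_list:
#     check_tfrecord(p)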

@numancelik34
Author

Same here @cchen-cc.
I have already checked, and here are the details of what I have regarding shape and type:
1- the data paths are stored in imagea_list and imageb_list correctly
2- the data written into the tfrecords are 256,256,3 for images and 256,256,1 for labels, for both train and target datasets; the dtype of both image sets is float32
3- the tfrecords are read back with the same shape and dtype: 256,256,3 (image) and 256,256,1 (labels); the dtype used to read the files is float32 as well

Please let me know if I am missing anything here.

Thanks

@muddasser-mm

@cchen-cc Thanks. The issue indeed was with the paths. It's solved for me now.

@numancelik34
Author

Hi @muddasser27 - what type of dataset are you using - the same one the SIFA paper used, or your own dataset? If it is your own dataset, how is your data composed? The usual shape arrangement - 3 dims and no slices?

Thanks

@muddasser-mm

Hi @numancelik34. I'm using the same dataset used in SIFA and shared by the authors.

@cchen-cc
Owner

@muddasser27 Congrats you've found the cause.
@numancelik34 Since you are using your own dataset, you may want to check your data conversion into tfrecords more carefully.
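
For anyone converting their own data, a rough sketch of how one image/label pair could be serialized so that tf.decode_raw(parser['data_vol'], tf.float32) and tf.decode_raw(parser['label_vol'], tf.float32) in the data loader can read it back. This is an assumption based on the snippets in this thread, not the authors' exact conversion script; the real loader may also expect extra shape fields:

import numpy as np
import tensorflow as tf

def write_pair(writer, image, label):
    # Store the image and label as raw float32 bytes under the keys the
    # data loader decodes ('data_vol' / 'label_vol').
    image = image.astype(np.float32)   # e.g. shape (256, 256, 3)
    label = label.astype(np.float32)   # e.g. shape (256, 256, 1)
    feature = {
        'data_vol': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[image.tobytes()])),
        'label_vol': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[label.tobytes()])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    writer.write(example.SerializeToString())

# usage sketch:
# with tf.python_io.TFRecordWriter('pair_0000.tfrecords') as w:
#     write_pair(w, my_image, my_label)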

@chke097

chke097 commented Jun 7, 2022

@muddasser27 I see here that you have successfully trained using the original authors' dataset. I am also trying to train with the original authors' dataset, but this error keeps appearing. How can I solve it? Or would you be willing to share your modified code?

[07:35<73:03:58, 26.32s/it]iter 6: processing time 0.4044947624206543
0%| | 7/10000 [07:36<49:32:30, 17.85s/it]iter 7: processing time 0.3983621597290039
0%| | 8/10000 [07:36<158:27:30, 57.09s/it]
Traceback (most recent call last):
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
return fn(*args)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "main.py", line 626, in
main(config_filename='./config_param.json')
File "main.py", line 622, in main
sifa_model.train()
File "main.py", line 411, in train
images_i, images_j, gts_i, gts_j= sess.run(self.inputs)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
run_metadata_ptr)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
run_metadata)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

Caused by op 'shuffle_batch', defined at:
File "main.py", line 626, in
main(config_filename='./config_param.json')
File "main.py", line 622, in main
sifa_model.train()
File "main.py", line 355, in train
self.inputs = data_loader.load_data(self._source_train_pth, self._target_train_pth, True)
File "/data2/xxb/SIFA-master/SIFA-master/data_loader.py", line 112, in load_data
1,50,10,num_threads=4)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 1300, in shuffle_batch
name=name)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 846, in _shuffle_batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 476, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
component_types=component_types, timeout_ms=timeout_ms, name=name)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
op_def=op_def)
File "/data2/xxb/anaconda/envs/tensorflow10/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in init
self._traceback = tf_stack.extract_stack()

OutOfRangeError (see above for traceback): RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 1, current size 0)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]

This problem always occurs after iterating for a certain number of steps.
