My training is stuck on the first epoch, and I can't seem to figure out why.
Below is what I see when I run python train1.py. Any help would be fantastic.
[fakarim@blipp78 deep-voice-conversion]$ vim log1_1.txt
net1/cbhg/highwaynet_1/dense2/bias:0 [64] 64
net1/cbhg/highwaynet_2/dense1/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_2/dense1/bias:0 [64] 64
net1/cbhg/highwaynet_2/dense2/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_2/dense2/bias:0 [64] 64
net1/cbhg/highwaynet_3/dense1/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_3/dense1/bias:0 [64] 64
net1/cbhg/highwaynet_3/dense2/kernel:0 [64, 64] 4096
net1/cbhg/highwaynet_3/dense2/bias:0 [64] 64
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/gates/kernel:0 [128, 128] 16384
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/gates/bias:0 [128] 128
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/candidate/kernel:0 [128, 64] 8192
net1/cbhg/gru/bidirectional_rnn/fw/gru_cell/candidate/bias:0 [64] 64
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/gates/kernel:0 [128, 128] 16384
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/gates/bias:0 [128] 128
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/candidate/kernel:0 [128, 64] 8192
net1/cbhg/gru/bidirectional_rnn/bw/gru_cell/candidate/bias:0 [64] 64
net1/dense/kernel:0 [128, 61] 7808
net1/dense/bias:0 [61] 61
Total #vars=58, #param=363389 (1.39 MB assuming all float32)
[0611 19:17:35 @base.py:158] Setup callbacks graph ...
[0611 19:17:35 @summary.py:34] Maintain moving average summary of 0 tensors.
[0611 19:17:36 @base.py:174] Creating the session ...
2018-06-11 19:17:37.187985: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-06-11 19:17:41.938789: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:0a:00.0
totalMemory: 15.77GiB freeMemory: 15.36GiB
2018-06-11 19:17:41.938844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-11 19:17:42.321558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-11 19:17:42.321609: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-06-11 19:17:42.321619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-06-11 19:17:42.322000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 15990 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:0a:00.0, compute capability: 7.0)
[0611 19:17:43 @base.py:182] Initializing the session ...
[0611 19:17:43 @base.py:189] Graph Finalized.
2018-06-11 19:17:45.195791: W tensorflow/core/kernels/queue_base.cc:285] _0_QueueInput/input_queue: Skipping cancelled dequeue attempt with queue not closed
[0611 19:17:45 @concurrency.py:36] Starting EnqueueThread QueueInput/input_queue ...
[0611 19:17:45 @graph.py:70] Running Op sync_variables_from_main_tower ...
[0611 19:17:45 @base.py:209] Start Epoch 1 ...
 0%| |0/100[00:00<?,?it/s]
Me too. Can you tell me how to fix it?
@fazlekarim Why did you close the issue? If you found a solution, please share it with us. I have the same issue with train1.py.
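For anyone else hitting this: the log ends with tensorpack starting the EnqueueThread for QueueInput/input_queue and the progress bar sitting at 0/100, which usually means the trainer is blocked on its first dequeue because the input queue is never fed. A common cause is a dataset glob that matches no files. Below is a minimal sanity check, a sketch only; the pattern shown is a hypothetical placeholder, so substitute whatever wav path your config actually points train1.py at.

import glob

# Hypothetical example pattern -- replace with the data path from your config.
data_path = '/path/to/TIMIT/TRAIN/*/*/*.wav'
wav_files = glob.glob(data_path)
print('found {} wav files'.format(len(wav_files)))
if not wav_files:
    # An empty glob means the enqueue thread has nothing to feed the queue,
    # so training sits at 0% on epoch 1 exactly as in the log above.
    print('empty glob: the input queue will never receive data')

If the glob comes back empty, this is a path/config problem rather than a TensorFlow one. (The "Skipping cancelled dequeue attempt" warning also appears in normal tensorpack runs and is not by itself a sign of failure.)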