Try to reproduce but some issues occur #3

Open · yizhidamiaomiao opened this issue Jul 13, 2022 · 7 comments

@yizhidamiaomiao commented Jul 13, 2022

I ran the command "./train.sh 0 se model_se".

The error is:
"""""""""""""""""""""""""""""""""
Preprocessing: 0%| | 0/11572 [00:00<?, ?it/s]
Traceback (most recent call last):
File "src/cdiffuse/preprocess.py", line 140, in
main(parser.parse_args())
File "src/cdiffuse/preprocess.py", line 120, in main
list(tqdm(executor.map(spec_transform, filenames, repeat(args.dir), repeat(args.outdir)), desc='Preprocessing', total=len(filenames)))
File "/home/tiger/.local/lib/python3.7/site-packages/tqdm/std.py", line 1195, in iter
for obj in iterable:
File "/usr/lib/python3.7/concurrent/futures/process.py", line 476, in _chain_from_iterable_of_lists
for element in iterable:
File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in result_iterator
yield fs.pop().result()
File "/usr/lib/python3.7/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
"""""""""""""""""""""""""""""""""
How can I solve this?
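(As a side note: BrokenProcessPool usually masks the worker's real exception, which is often an out-of-memory kill or an error raised inside the mapped function. A minimal debugging sketch, assuming spec_transform takes the (filename, dir, outdir) arguments implied by the traceback above, is to bypass the process pool and run the map serially so the underlying error surfaces with a full traceback:)

```python
# Debugging sketch (hypothetical helper, not part of the repository):
# run the preprocessing map serially so the real exception is raised in the
# main process instead of being hidden behind BrokenProcessPool.
from tqdm import tqdm

def preprocess_serial(spec_transform, filenames, in_dir, out_dir):
    for filename in tqdm(filenames, desc='Preprocessing (serial)'):
        # Same call the executor would make, per the traceback above.
        spec_transform(filename, in_dir, out_dir)
```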

Although "se_pre" mode can run, with the dataset provided by your link, I MUST change the sample_rate to 48000 in params.py, otherwise this code will throw a wrong information. Does this correct for the reproduce?

Also, I have trained the "se_pre" mode for 12 hours on 4 GPUs and am at step 156600. How long (how many epochs) do we need to train your model?

yizhidamiaomiao changed the title from "Try to reproduce but issue happens for se mode" to "Try to reproduce but some issues occur" on Jul 13, 2022
@neillu23 (Owner)

Hi @yizhidamiaomiao, thanks for sharing your experience!
I've replaced torchaudio.load_wav() with the torchaudio.load() function in the new commit. This may fix some errors, as torchaudio.load_wav has been removed in newer versions of torchaudio.
For the second question, can you share a link to the data, and is the sample rate of the data you are using 48000?
Also, the "se_pre" step is no longer needed, as the randomly initialized CDiffuSE performs as well as the one initialized from pre-trained parameters. The model with step 507600 (no pre-training) in my experiments slightly exceeded our paper's results.
Please try the new code and let me know if you have any further questions!
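(For reference, the load-call change is roughly the following. Note that torchaudio.load returns a float waveform normalized to [-1, 1] plus the sample rate, while the removed torchaudio.load_wav returned int16-range values, so any downstream scaling may need adjusting; this is a sketch, not the exact repository code.)

```python
import torchaudio

path = "example.wav"  # hypothetical file

# Old API (removed in recent torchaudio versions), returned int16-range values:
# audio, sr = torchaudio.load_wav(path)

# New API: float tensor normalized to [-1, 1], shape (channels, num_samples).
audio, sr = torchaudio.load(path)
```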

@yizhidamiaomiao (Author)

Hi, thank you for your response!

Following your instructions, I am no longer training "se_pre". I trained your model directly with the command "./train.sh 0 se model_se" and evaluated at step 600075 with "./inference.sh 0 600075 se model_se".

However, the training result seems different from your 'Sample Files' folder. Here is a link to the speech generated by the trained model 'weights-600075.pt': https://drive.google.com/drive/folders/1aK0zzC1wDToWIAoEq2dNSsdwSo9n9rWd?usp=sharing. It does not seem competitive with the SOTA models. Could you please help us figure out what we should do to reproduce the results in 'Sample Files'?

@neillu23 (Owner)

Hi @yizhidamiaomiao, thanks for sharing the audio file!
The command you are using seems to be from a previous commit. I've updated the command style and torchaudio functions in this commit: 7e13e6e
The new commands would be "./train.sh 0 model_se" and "./inference.sh 0 model_se 600075". Here are the results I got from my trained model 'weights-54000.pt': https://drive.google.com/drive/folders/1EIh-ZwokHcRacv20Umk9MMkld9ETdBSQ?usp=sharing
The environment I used was torchaudio 0.9.0 / PyTorch 1.9.0.
If this doesn't work for you, please let me know; thanks again!

@yizhidamiaomiao (Author)

Thanks for your response!

We downloaded your newest code, trained with the command "./train.sh 0 model_se", and ran inference with "./inference.sh 0 model_se 108000". The trained model is 'weights-108000.pt'. The newest results we got are at https://drive.google.com/drive/folders/1aK0zzC1wDToWIAoEq2dNSsdwSo9n9rWd?usp=sharing, in the files named "*_enhanced_ver 7e13e6e.wav". There still seems to be some noise in the enhanced speech.

Should we wait until step 507600?

The environment I used is torchaudio 0.10.0+cu113 / PyTorch 1.10.0.

I will wait for any further guidance. Thank you for your patience!

@neillu23 (Owner)

Thank you for reporting the updated results!

I think a possible reason is a difference between our training data. You mentioned that the data you used has a 48000 Hz sampling rate, but the data I used is at 16000 Hz. Could you share your training data and model with me so I can check whether your data/model works in my environment?

Thank you again, and sorry for the inconvenience!

@yizhidamiaomiao (Author)

I used the data directly from the link "https://datashare.ed.ac.uk/handle/10283/2791" given in the README.md sentence "The default dataset is VOICEBANK-DEMAND dataset. You can download them from VOICEBANK-DEMAND)". The audio downloaded from that website is actually 48 kHz, and I had to add a torchaudio resample from 48 kHz to 16 kHz in the "transform" function of your preprocess file to train the code.
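(A sketch of that preprocessing-side change, using torchaudio.functional.resample; the helper name and the exact placement inside the transform function are assumptions, and the 48000/16000 rates follow the discussion above.)

```python
import torchaudio
import torchaudio.functional as AF

def load_and_resample(path, target_sr=16000):
    # Load a 48 kHz VOICEBANK-DEMAND wav and resample it to the 16 kHz rate
    # expected by params.py (rates are assumptions from the thread above).
    audio, sr = torchaudio.load(path)
    if sr != target_sr:
        audio = AF.resample(audio, orig_freq=sr, new_freq=target_sr)
    return audio, target_sr
```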

@neillu23 (Owner) commented Aug 1, 2022

The data I'm using is already at a 16 kHz sample rate, which may be different from the data at that link. Could you try adding a 48 kHz to 16 kHz resample for both "signal" and "noisy_signal" in the __getitem__ function of NumpyDataset here? If this works, I will update the description in the README. Sorry again about this issue.
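(A minimal sketch of that suggestion: cache one Resample module and apply it to both waveforms inside NumpyDataset.__getitem__. The variable names follow the comment above; the 48000-to-16000 rates and the helper itself are assumptions, not repository code.)

```python
import torch
import torchaudio

# One cached Resample module (48 kHz -> 16 kHz), reused for every item.
_resample_48k_to_16k = torchaudio.transforms.Resample(orig_freq=48000, new_freq=16000)

def resample_pair(signal: torch.Tensor, noisy_signal: torch.Tensor):
    # Apply the same resampling to the clean and noisy waveforms so they stay aligned.
    return _resample_48k_to_16k(signal), _resample_48k_to_16k(noisy_signal)
```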
