Try to reproduce but some issues occur #3
Comments
Hi @yizhidamiaomiao, thanks for sharing your experience!
Hi, thank you for your response! Following your instructions, I no longer train "se_pre". I trained your model directly with the command "./train.sh 0 se model_se" and evaluated at step 600075 with the command "./inference.sh 0 600075 se model_se". However, the results differ from those in your 'Sample Files' folder. Here is the link to the speech generated by the trained model 'weights-600075.pt': https://drive.google.com/drive/folders/1aK0zzC1wDToWIAoEq2dNSsdwSo9n9rWd?usp=sharing, which may not be competitive with the SOTA model. Could you please help us find out what we should do to reproduce your results in 'Sample Files'?
Hi @yizhidamiaomiao, thanks for sharing the audio file!
Thanks for your response! We downloaded your newest code, trained with the command "./train.sh 0 model_se", and ran inference with the command "./inference.sh 0 model_se 108000". The trained model is 'weights-108000.pt'. The newest results are at the link https://drive.google.com/drive/folders/1aK0zzC1wDToWIAoEq2dNSsdwSo9n9rWd?usp=sharing in files named "*_enhanced_ver 7e13e6e.wav". There still seems to be some noise in the enhanced speech. Should we wait for step 507600? My environment is torchaudio '0.10.0+cu113' / pytorch 1.10.0. I look forward to any further guidance, and thanks for your patience!
Thank you for reporting these results! A possible reason is a difference in our training data. You mentioned that your data has a 48000 Hz sampling rate, but the data I used has a 16000 Hz sampling rate. Could you share your training data and model with me so I can check whether your data/model works in my environment? Thank you again, and sorry for the inconvenience!
I use the data directly from your link "https://datashare.ed.ac.uk/handle/10283/2791", given in the sentence "The default dataset is VOICEBANK-DEMAND dataset. You can download them from VOICEBANK-DEMAND)" in the README.md file. The audio downloaded from that website is actually 48k, so I had to add a torchaudio.resample(48k, 16k) to the "transform" function in your preprocess file to train the code.
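A quick way to confirm which sampling rate the downloaded files actually have is to read it from the WAV headers before preprocessing. The snippet below is a minimal sketch, assuming the files are plain PCM WAV (which is what Python's standard `wave` module reads); the helper names `wav_sample_rate` and `find_rate_mismatches` are made up for illustration and are not part of the repository. The demo file is synthetic.

```python
import os
import tempfile
import wave


def wav_sample_rate(path):
    """Return the sampling rate stored in the header of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def find_rate_mismatches(paths, expected_rate=16000):
    """List (path, rate) pairs for files whose rate differs from expected_rate."""
    mismatches = []
    for path in paths:
        rate = wav_sample_rate(path)
        if rate != expected_rate:
            mismatches.append((path, rate))
    return mismatches


# Demo on a synthetic 48 kHz file (10 ms of silence), standing in for a
# file from the downloaded dataset.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_48k.wav")
with wave.open(demo_path, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)  # 16-bit PCM
    wav_file.setframerate(48000)
    wav_file.writeframes(b"\x00\x00" * 480)

mismatches = find_rate_mismatches([demo_path])
print(mismatches)
```

Running `find_rate_mismatches` over the real training file list would show whether every file is at 48k or only some of them.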
The data I'm using is already at a 16k sample rate, which may differ from the data in the link. Could you try adding a torchaudio.resample(48k, 16k) for both "signal" and "noisy_signal" in the __getitem__ function of NumpyDataset? If this works, I will update the description in the README. Sorry again about this issue.
I ran the command "./train.sh 0 se model_se" and got the following error:
"""""""""""""""""""""""""""""""""
Preprocessing: 0%| | 0/11572 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "src/cdiffuse/preprocess.py", line 140, in <module>
    main(parser.parse_args())
  File "src/cdiffuse/preprocess.py", line 120, in main
    list(tqdm(executor.map(spec_transform, filenames, repeat(args.dir), repeat(args.outdir)), desc='Preprocessing', total=len(filenames)))
  File "/home/tiger/.local/lib/python3.7/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.7/concurrent/futures/process.py", line 476, in _chain_from_iterable_of_lists
    for element in iterable:
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 586, in result_iterator
    yield fs.pop().result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
"""""""""""""""""""""""""""""""""
How can I solve this?
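`BrokenProcessPool` only says that a worker process died; it does not say why. One possible cause (an assumption here, not confirmed in this thread) is a worker being killed for using too much memory. A way to narrow it down is to use fewer workers, or to run the same function in the parent process so that any underlying exception is shown directly. The sketch below uses only the standard library; `map_with_serial_fallback` is a hypothetical helper, not part of preprocess.py.

```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def map_with_serial_fallback(fn, items, max_workers=None):
    """Map fn over items in a process pool; rerun serially if the pool breaks.

    max_workers=0 skips the pool and runs everything in the current
    process, which is the simplest way to see the real error.
    """
    items = list(items)
    if max_workers == 0:
        return [fn(item) for item in items]
    try:
        with ProcessPoolExecutor(max_workers=max_workers) as executor:
            return list(executor.map(fn, items))
    except BrokenProcessPool:
        # A worker terminated abruptly. Running in the parent process is
        # slower, but a Python-level error will now surface with its own
        # traceback instead of being hidden behind the pool.
        return [fn(item) for item in items]


# Demo: serial mode with a stand-in for the per-file transform.
lengths = map_with_serial_fallback(len, ["a.wav", "bb.wav"], max_workers=0)
print(lengths)
```

If the serial run completes, lowering `max_workers` in the pooled run is a reasonable next thing to try.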
"se_pre" mode does run, but with the dataset from your link I MUST change sample_rate to 48000 in params.py; otherwise the code raises an error. Is this the correct setting for reproducing your results?
Also, I have trained in "se_pre" mode for 12 hours on 4 GPUs and reached step 156600. How long (how many epochs) do we need to train your model?