Help with an error: Segmentation fault (core dumped) #846

Open
super31425 opened this issue Jan 7, 2025 · 6 comments

Comments

@super31425

Running: cosyvoice = CosyVoice2('pretrained_models/CosyVoice2-0.5B', load_jit=False, load_trt=False, fp16=False)
2025-01-07 20:37:38,786 INFO input frame rate=25
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
2025-01-07 20:37:42.280338132 [W:onnxruntime:, session_state.cc:1162 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-01-07 20:37:42.280368311 [W:onnxruntime:, session_state.cc:1164 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
text.cc: festival_Text_init
open voice lang map failed
Segmentation fault (core dumped)

@aluminumbox
Collaborator

Please pinpoint which line triggers it. This is a machine or environment issue.

@fallbernana123456

fallbernana123456 commented Jan 8, 2025

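> Please pinpoint which line triggers it. This is a machine or environment issue.

I ran into the same problem.
Location of the crash: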
Current thread 0x00007f8ee8b72740 (most recent call first):
File "lib/python3.10/site-packages/torch/_ops.py", line 854 in __call__
File "lib/python3.10/site-packages/torchaudio/_backend/sox.py", line 44 in load
File "lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 205 in load
File "cosyvoice/utils/file_utils.py", line 42 in load_wav
prompt_speech_16k = load_wav('zero_shot_prompt.wav', 16000)
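
A trace in the "Current thread ... (most recent call first)" format is what Python's standard faulthandler module prints on a fatal signal. As a minimal sketch of how such a trace can be captured, the snippet below enables it before the crashing call. The log file name segfault_trace.log is a hypothetical choice, and the same effect should be available by launching the script with python -X faulthandler.

```python
import faulthandler

# Hypothetical log file name; keep the file object alive for the whole run,
# because faulthandler writes to its file descriptor when the crash happens.
trace_log = open('segfault_trace.log', 'w')

# Dump the Python traceback of all threads if the process receives a fatal
# signal such as SIGSEGV.
faulthandler.enable(file=trace_log, all_threads=True)

# From here on, run the code that crashes, for example the load_wav(...) call
# from the report above. After the crash, inspect segfault_trace.log.
```

Writing to a file rather than the terminal keeps the trace even when the console output is lost after the crash.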

I tried modifying the load_wav method:
'''
import librosa
def load_wav(wav, target_sr):
    # Load the audio file with librosa
    speech, sample_rate = librosa.load(wav, sr=None)  # sr=None keeps the original sample rate
    # Convert multi-channel audio to mono
    speech = librosa.to_mono(speech)
    if sample_rate != target_sr:
        assert sample_rate > target_sr, f'wav sample rate {sample_rate} must be greater than {target_sr}'
        # Resample with librosa
        speech = librosa.resample(speech, orig_sr=sample_rate, target_sr=target_sr)
    return speech
'''
But this raises an error:
'''
for i, j in enumerate(cosyvoice.inference_cross_lingual('在他讲述那个荒诞故事的过程中,他突然[laughter]停下来,因为他自己也被逗笑了[laughter]。', prompt_speech_16k, stream=False)):
File "cosyvoice/cli/cosyvoice.py", line 90, in inference_cross_lingual
model_input = self.frontend.frontend_cross_lingual(i, prompt_speech_16k, self.sample_rate)
File "cosyvoice/cli/frontend.py", line 162, in frontend_cross_lingual
model_input = self.frontend_zero_shot(tts_text, '', prompt_speech_16k, resample_rate)
File "cosyvoice/cli/frontend.py", line 144, in frontend_zero_shot
prompt_speech_resample = torchaudio.transforms.Resample(orig_freq=16000, new_freq=resample_rate)(prompt_speech_16k)
File "lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "lib/python3.10/site-packages/torchaudio/transforms/_transforms.py", line 979, in forward
return _apply_sinc_resample_kernel(waveform, self.orig_freq, self.new_freq, self.gcd, self.kernel, self.width)
File "lib/python3.10/site-packages/torchaudio/functional/functional.py", line 1454, in _apply_sinc_resample_kernel
if not waveform.is_floating_point():
AttributeError: 'numpy.ndarray' object has no attribute 'is_floating_point'
'''

@jhxiang

jhxiang commented Jan 8, 2025

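> Please pinpoint which line triggers it. This is a machine or environment issue.

I ran into the same problem too. The crash is located at the load_wav line, and the error is as follows:
[screenshot of the error]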

@jhxiang

jhxiang commented Jan 8, 2025

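@fallbernana123456 Your modification is fine, but the numpy array has to be converted to a torch.tensor. Then change the audio saving code to the following form:

import soundfile as sf  # assumption: sf refers to the soundfile package

for i, j in enumerate(cosyvoice.inference_zero_shot('收到好友从远方寄来的生日礼物,那份意外的惊喜与深深的祝福让我心中充满了甜蜜的快乐,笑容如花儿般绽放。', '希望你以后能够做的比我还好呦。', prompt_speech_16k, stream=False)):
    # Move the generated speech to the CPU and drop the channel dimension before writing
    audio_np = j['tts_speech'].squeeze(0).cpu().numpy()
    sf.write('zero_shot_{}.wav'.format(i), audio_np, cosyvoice.sample_rate)

With these changes I no longer get any errors. Every torchaudio method I tried ended in a segmentation fault.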
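
The numpy-to-tensor conversion described in this comment could look roughly like the sketch below. The helper name to_prompt_tensor is hypothetical. It assumes the rest of the pipeline expects a floating-point tensor of shape (1, num_samples), which is what the traceback's torchaudio Resample call would accept. A zero-filled array stands in for the waveform returned by librosa.

```python
import numpy as np
import torch


def to_prompt_tensor(speech: np.ndarray) -> torch.Tensor:
    """Convert a mono numpy waveform into a (1, num_samples) float32 tensor.

    Hypothetical helper: the name and the expected shape are assumptions,
    not part of the CosyVoice code base.
    """
    # Make the dtype explicit in case a loader returns float64 or integers.
    speech = np.ascontiguousarray(speech, dtype=np.float32)
    if speech.ndim != 1:
        raise ValueError(f'expected a mono 1-D waveform, got shape {speech.shape}')
    # Add the leading channel dimension.
    return torch.from_numpy(speech).unsqueeze(0)


# One second of silence at 16 kHz stands in for the loaded prompt audio.
dummy_waveform = np.zeros(16000, dtype=np.float32)
prompt_speech_16k = to_prompt_tensor(dummy_waveform)
```

In the modified load_wav, the final line would then become return to_prompt_tensor(speech) instead of returning the raw numpy array.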

@fallbernana123456

frontend_cross_lingual

That doesn't work. I get the error right at this line:
for i, j in enumerate(cosyvoice.inference_zero_shot('收到好友从远方寄来的生日礼物,那份意外的惊喜与深深的祝福让我心中充满了甜蜜的快乐,笑容如花儿般绽放。', '希望你以后能够做的比我还好呦。', prompt_speech_16k, stream=False)):

@jhxiang

jhxiang commented Jan 9, 2025

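> Please pinpoint which line triggers it. This is a machine or environment issue.

> I ran into the same problem too. The crash is located at the load_wav line, and the error is as follows: [screenshot of the error]

Running conda install ffmpeg fixed the torchaudio segmentation fault.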
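
Before reinstalling anything, a quick check of whether an ffmpeg executable is visible in the active environment can be sketched as follows. This only looks for the executable on PATH. Whether torchaudio needs the executable or only the ffmpeg shared libraries is not stated in this thread, so treat a positive result as a hint rather than proof. The function name find_ffmpeg is hypothetical.

```python
import shutil


def find_ffmpeg():
    """Return the path of the ffmpeg executable on PATH, or None if absent."""
    return shutil.which('ffmpeg')


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print('ffmpeg not found on PATH; installing it (e.g. conda install ffmpeg) may help')
else:
    print(f'ffmpeg found at {ffmpeg_path}')
```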


4 participants