audio file as input parameter for model.transcribe works well but ndarray-typed parameter captured with sounddevice does not work #1253

cyflhn · 2025-02-21T01:20:24Z

model.transcribe works well when I pass an audio file as the input parameter. But when I use sounddevice to record a period of speech, keep the result as an ndarray, and pass it directly to model.transcribe, the speech is not recognized.
If I instead save the speech recorded by sounddevice as an audio file and pass that file to model.transcribe, the speech is recognized. What is the problem? Is there a specific format requirement for the ndarray parameter?

MahmoudAshraf97 · 2025-02-24T20:55:23Z

Make sure the array is mono, sampled at 16 kHz, and of dtype float32.
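
A quick way to check those three properties before calling model.transcribe is sketched below. `check_audio` is a hypothetical helper written for illustration, not part of any library, and it assumes "mono" means a 1-D array of shape (frames,). Note that sounddevice returns 2-D arrays of shape (frames, channels), even when only one channel is recorded.

```python
import numpy as np


def check_audio(audio, sample_rate):
    """Return a list of mismatches against mono / 16 kHz / float32.

    Hypothetical helper for illustration only. Assumes "mono" means a
    1-D array of shape (frames,).
    """
    problems = []
    if audio.ndim != 1:
        problems.append(f"expected a 1-D mono array, got shape {audio.shape}")
    if audio.dtype != np.float32:
        problems.append(f"expected dtype float32, got {audio.dtype}")
    if sample_rate != 16000:
        problems.append(f"expected 16000 Hz, got {sample_rate} Hz")
    return problems
```

An empty list means the array matches the description above; otherwise each entry names one thing to fix.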

cyflhn · 2025-02-25T01:04:46Z

mono

What does a mono array mean? Could you please give me an example? I am not very familiar with audio and video technology. Here is my code for recording speech:
recording = sd.rec(int(duration * fs), samplerate=fs, channels=2, device=device_index)

MahmoudAshraf97 · 2025-02-25T07:22:08Z

Mono means single channel.
And sr (the sample rate) should be 16000.
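
As a rough sketch of what that conversion could look like for a sounddevice recording: `to_mono_16k_float32` is a hypothetical helper, not a library function. It assumes the recording is already floating point in the range [-1, 1] (sounddevice's default dtype is float32), averages the channels to get mono, and resamples with plain linear interpolation, which has no anti-aliasing filter and is only meant as an illustration.

```python
import numpy as np

TARGET_SR = 16000  # sample rate suggested in this thread


def to_mono_16k_float32(recording, fs):
    """Convert a (frames, channels) recording to a 1-D float32 array at 16 kHz.

    Hypothetical helper for illustration. Assumes float samples in [-1, 1].
    """
    audio = np.asarray(recording, dtype=np.float32)
    if audio.ndim == 2:
        # Average the channel columns; this also flattens (frames, 1) to (frames,).
        audio = audio.mean(axis=1)
    if fs != TARGET_SR:
        # Crude linear-interpolation resampling (no anti-aliasing filter).
        n_out = int(round(len(audio) * TARGET_SR / fs))
        old_t = np.arange(len(audio)) / fs
        new_t = np.arange(n_out) / TARGET_SR
        audio = np.interp(new_t, old_t, audio).astype(np.float32)
    return audio


# Intended usage (needs a microphone, so it is left commented out):
# recording = sd.rec(int(duration * fs), samplerate=fs, channels=2, device=device_index)
# sd.wait()  # block until the recording has finished
# result = model.transcribe(to_mono_16k_float32(recording, fs))
```

Recording directly with samplerate=16000 and channels=1 avoids the resampling step, but the result still has shape (frames, 1) and may need flattening to (frames,) before it is passed on.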

cyflhn · 2025-02-25T11:05:50Z

Mono means single channel And sr should be 16000

I modified my code according to your suggestion, but it still did not work.
