Detect language and transcribe in separate steps #1245
Comments
Yes, you can: faster-whisper/faster_whisper/transcribe.py, lines 1726 to 1754 in 9e657b4.

@heimoshuiyu `detect_language` expects the `audio` or `features` argument as an `np.ndarray`. There is no method to get the audio or features in the required format.
There is an obvious way to decode audio: just look at the top of the transcribe method, faster-whisper/faster_whisper/transcribe.py, lines 824 to 834 in 9e657b4.

All you need to do is:

```python
from faster_whisper.audio import decode_audio

audiofile = "audio.mp3"  # can also be a BytesIO (binary file object)
audio = decode_audio(audiofile, sampling_rate=model.feature_extractor.sampling_rate)
language, language_probability, all_language_probs = model.detect_language(audio)
print(f"Detected language: {language}")
```

Note: this considers only the first 30-second segment of the audio. Use `model.detect_language(audio, language_detection_segments=SEGMENTS_NUM)` to specify how many segments to analyze.
Is it possible to detect the language of an audio file and transcribe it in separate steps?

I have a fine-tuned model for a specific language. I'm trying to detect the language, then use the fine-tuned model if the language matches, or the general model otherwise.

openai/whisper shows how to do this in its README, but I couldn't find an equivalent in faster-whisper.
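The routing described in the question can be sketched as a small helper that picks a model based on the detected language. This is a minimal sketch, not code from the thread; the language code `"de"` and the model names below are hypothetical placeholders.

```python
# Sketch: detect the language first, then choose between a fine-tuned
# and a general model for transcription.

def pick_model(detected_language, target_language, finetuned_model, general_model):
    """Use the fine-tuned model only when the detected language matches."""
    if detected_language == target_language:
        return finetuned_model
    return general_model

# Intended usage with faster-whisper (not executed here; the model paths
# are hypothetical):
#
#   from faster_whisper import WhisperModel
#   from faster_whisper.audio import decode_audio
#
#   general = WhisperModel("small")
#   finetuned = WhisperModel("path/to/finetuned-model")
#   audio = decode_audio(
#       "audio.mp3",
#       sampling_rate=general.feature_extractor.sampling_rate,
#   )
#   language, probability, _ = general.detect_language(audio)
#   model = pick_model(language, "de", finetuned, general)
#   segments, info = model.transcribe(audio, language=language)
```

Passing `language=language` to `transcribe` skips a second detection pass, since the language is already known from the first step.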