faster-whisper vs. whisperX #1242

Open
sijitang opened this issue Feb 9, 2025 · 1 comment

Comments

sijitang commented Feb 9, 2025

Hi,

I transcribed the same audio with both whisperX and faster-whisper, and the two resulting subtitle files differ as follows:

whisperX’s subtitles miss some parts of the content, but the timeline alignment is relatively good.
faster-whisper’s subtitles are almost complete in terms of content, but the timestamps seem off: either inaccurate or spanning too long.
Even when I switch whisperX to its alternative VAD method (Silero), it still doesn’t capture as much content as faster-whisper.

My question is: why does this happen? Isn’t whisperX also using faster-whisper for transcription? Why is content missing?
Is it possible to adjust some parameters in whisperX so that it matches faster-whisper’s transcription completeness while retaining whisperX’s alignment capability?
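
One possible way to combine the two, sketched here as an untested idea rather than a confirmed fix: transcribe with faster-whisper directly, then pass those segments to whisperX's forced-alignment step. The helper names `to_whisperx_segments` and `transcribe_then_align` are made up for illustration, and the model name, device, and compute type are assumptions. Whether alignment stays accurate on long faster-whisper segments has not been verified for this audio.

```python
from typing import Any, Dict, Iterable, List


def to_whisperx_segments(segments: Iterable[Any]) -> List[Dict[str, Any]]:
    """Convert faster-whisper segments (objects with .start, .end, .text)
    into the list of dicts that whisperx.align takes as its transcript."""
    return [
        {"start": float(s.start), "end": float(s.end), "text": s.text.strip()}
        for s in segments
        if s.text.strip()  # drop segments that contain only whitespace
    ]


def transcribe_then_align(audio_path: str,
                          model_name: str = "large-v2",  # assumed model
                          device: str = "cuda"):         # assumed device
    # Imported lazily so the conversion helper works without these packages.
    from faster_whisper import WhisperModel
    import whisperx

    # Step 1: plain faster-whisper transcription (no whisperX VAD chunking).
    model = WhisperModel(model_name, device=device, compute_type="float16")
    segments, info = model.transcribe(audio_path)
    transcript = to_whisperx_segments(segments)

    # Step 2: whisperX forced alignment on the faster-whisper text.
    audio = whisperx.load_audio(audio_path)
    align_model, metadata = whisperx.load_align_model(
        language_code=info.language, device=device
    )
    return whisperx.align(transcript, align_model, metadata, audio, device)
```

Calling `transcribe_then_align("audio.wav")` (a placeholder path) should return whisperX's aligned result dictionary, which can then be compared against the subtitles each tool produces on its own.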

Does anyone with experience in improving transcription quality have any suggestions that could help me out?

Thanks

heimoshuiyu (Contributor) commented

Please provide the parameters you used for whisperX and faster-whisper, preferably along with an audio file and reproducible steps. Without those, I can only guess that this is related to word-level timestamps.
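
To make such a report concrete, a small helper like the one below could list the time ranges that one tool transcribed and the other skipped. This is only a sketch: the function name `uncovered` and the `min_gap` threshold are hypothetical, and it assumes both outputs have already been reduced to `(start, end)` pairs in seconds.

```python
from typing import List, Tuple

Interval = Tuple[float, float]


def uncovered(reference: List[Interval],
              candidate: List[Interval],
              min_gap: float = 0.5) -> List[Interval]:
    """Return the parts of the reference intervals that no candidate
    interval overlaps, ignoring gaps shorter than min_gap seconds."""
    gaps: List[Interval] = []
    cand = sorted(candidate)
    for r_start, r_end in sorted(reference):
        cursor = r_start  # everything before cursor is already accounted for
        for c_start, c_end in cand:
            if c_end <= cursor:
                continue  # candidate ends before the uncovered region
            if c_start >= r_end:
                break     # candidates are sorted, so the rest start later
            if c_start > cursor:
                gaps.append((cursor, c_start))
            cursor = max(cursor, c_end)
            if cursor >= r_end:
                break
        if cursor < r_end:
            gaps.append((cursor, r_end))
    return [(a, b) for a, b in gaps if b - a >= min_gap]
```

Passing the faster-whisper segment times as `reference` and the whisperX segment times as `candidate` would give the stretches of audio to listen to when checking for missing content.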
