-
These projects (below) all use a variation of the alignment approach. I think the first one does exactly what you need: https://github.com/EtienneAb3d/WhisperTimeSync
-
Thanks. I know those projects, but they have some drawbacks for me. I'm looking for a way to do that with this project. Thanks again.
-
Thank you @RaulKite for your loyalty :) Indeed, in theory it's possible to use the same approach as whisper-timestamped (i.e. Whisper models with their cross-attention weights) to align a given transcription of an audio file, even if that transcription was not produced by Whisper. It requires reorganizing the code a bit, which is not a big deal. So @RaulKite, what transcription format would you provide for an audio file? EDIT: a Whisper transcription also includes the (detected?) language. Is that a piece of information you would like to provide, or do you want it to be detected automatically?
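To make the idea a bit more concrete, here is a rough sketch of the alignment step only. It is not the whisper-timestamped code: it assumes you already have a (tokens x frames) matrix of attention weights in [0, 1] for the known transcript, and it uses a plain dynamic time warping (DTW) pass to find a monotonic token-to-frame path. The names `dtw_path`, `token_times` and `seconds_per_frame` are made up for illustration, the 0.02 s frame step is an assumption, and the matrix in the example is synthetic.

```python
import numpy as np


def dtw_path(cost):
    """Return the minimum-cost monotonic path through a (tokens x frames) cost matrix."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost with an inf border
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    # Backtrack from the last cell to the first one.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1, j - 1], i - 1, j - 1),  # next token and next frame
            (acc[i - 1, j], i - 1, j),          # next token, same frame
            (acc[i, j - 1], i, j - 1),          # same token, next frame
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    return path[::-1]


def token_times(attention, seconds_per_frame=0.02):
    """Map each token to (start, end) in seconds.

    `attention` is assumed to hold weights in [0, 1]; `seconds_per_frame`
    is an assumed frame step, not a value read from any model.
    """
    path = dtw_path(1.0 - attention)  # high attention means low cost
    spans = {}
    for tok, frame in path:
        first, last = spans.get(tok, (frame, frame))
        spans[tok] = (min(first, frame), max(last, frame))
    return [
        (spans[t][0] * seconds_per_frame, (spans[t][1] + 1) * seconds_per_frame)
        for t in range(attention.shape[0])
    ]


# Synthetic example: 3 tokens, 6 frames, each token "attends" to 2 frames.
attention = np.full((3, 6), 0.1)
attention[0, 0:2] = 0.9
attention[1, 2:4] = 0.9
attention[2, 4:6] = 0.9
for tok, (start, end) in enumerate(token_times(attention)):
    print(f"token {tok}: {start:.2f}s -> {end:.2f}s")
```

In a real setup the matrix would have to come from running the model on the audio with the known transcript as decoder input, and word timestamps would then be obtained by merging the spans of the tokens that make up each word.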
-
Hi,
I know there is a method to align just the words when I already have an accurate transcription of the audio.
I'm also quite sure I have seen a way to do that in Python somewhere, but I'm not able to find it again.
Can someone point me to a way to do that?
Thanks