Whisper is an acoustic model and a language model in one. As with any language model, you can give Whisper a prompt that is more than just the special token(s): a (short) text, which typically corresponds to what was said just before the audio chunk currently being processed. So, when OpenAI Whisper transcribes audio longer than 30 seconds, it uses what was transcribed in the first ~30 seconds as the prompt for the next 30 seconds of audio (and the initial prompt is no longer taken into consideration). `initial_prompt` is the text that plays this role for the first window, where there is no previous transcription yet. I hope that clarifies things. Do not hesitate to ask for more clarification.
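To make the hand-off concrete, here is a toy sketch of the behaviour described above. It is not Whisper's actual code: `transcribe_windows`, `fake_decode` and the chunk names are hypothetical stand-ins, and the "decoder" only records which prompt each window received.

```python
from typing import Callable, List, Optional


def transcribe_windows(
    windows: List[str],
    decode_window: Callable[[str, str], str],
    initial_prompt: Optional[str] = None,
) -> List[str]:
    """Toy model of the prompt hand-off between ~30 s windows (assumption, not Whisper's code)."""
    # The first window is prompted with the user-supplied text, if any.
    prompt = initial_prompt or ""
    texts = []
    for window in windows:
        text = decode_window(window, prompt)
        texts.append(text)
        # Per the description above: the next window is prompted with what
        # was just transcribed, so the initial prompt drops out.
        prompt = text
    return texts


# Hypothetical stand-in for the model: remembers the prompt it was given.
seen_prompts: List[str] = []


def fake_decode(window: str, prompt: str) -> str:
    seen_prompts.append(prompt)
    return f"text of {window}"


result = transcribe_windows(
    ["chunk0", "chunk1", "chunk2"],
    fake_decode,
    initial_prompt="Glossary: Whisper, OpenAI.",
)
print(seen_prompts)
```

With the real `openai-whisper` package, the corresponding call is `model.transcribe(path, initial_prompt="...")`, and the `condition_on_previous_text` option controls whether each window is prompted with the preceding text.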
When using the `transcribe` function, what is the meaning of the `initial_prompt` parameter? I read the documentation ("optional text to provide as a prompt for the first window.") and I am still not sure I understand the meaning.