How to transcribe audio in Indian English #1246

ranjitsingha · 2025-02-14T05:07:25Z

Hey there, I want to know if it is possible to take audio whose source language is Hindi and transcribe it to Indian English, also known as Hinglish.

When I use the large-v3-turbo model with the language set to "hi":

model.transcribe(audio='audio.mp3', language="hi", word_timestamps=True)

I get this output:

00:00:00,000 --> 00:00:00,320
पता
00:00:00,320 --> 00:00:00,440
है
00:00:00,440 --> 00:00:01,080
सबसे
00:00:01,080 --> 00:00:01,540
डरावनी

I want it in Indian English, for example:

00:00:00,000 --> 00:00:00,320
Pata
00:00:00,320 --> 00:00:00,440
hai
00:00:00,440 --> 00:00:01,080
sabse
00:00:01,080 --> 00:00:01,540
darawani

But if I set the language to "en", I get this output:

00:00:00,000 --> 00:00:00,320
Do
00:00:00,320 --> 00:00:00,440
you
00:00:00,440 --> 00:00:01,080
know
00:00:01,080 --> 00:00:01,540
that


emcodem · 2025-03-06T20:37:26Z

No, "Hinglish" is not a supported language in the Whisper models.
You can try setting the language to Hindi and working with a perpetual Hinglish prompt (e.g. two short sentences that always stay on the left side of the prompt), but I doubt the results will be stable or satisfying.
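
A minimal sketch of that suggestion, assuming the loaded model's transcribe() accepts an `initial_prompt` argument alongside the `language` and `word_timestamps` arguments used in the question. The helper name `transcribe_hinglish` and the prompt text are made up for illustration, and whether the prompt stays in effect for later segments depends on the library and version, so treat this as something to experiment with, not a confirmed fix:

```python
# Hypothetical prompt: two short sentences of Hindi written in Latin script,
# intended to nudge the model toward romanized output. The wording is an
# assumption, not something taken from the thread.
HINGLISH_PROMPT = "Namaste, aap kaise hain? Main theek hoon, dhanyavaad."


def transcribe_hinglish(model, audio_path, prompt=HINGLISH_PROMPT):
    """Transcribe Hindi audio while prompting with romanized Hindi text.

    `model` is assumed to be an already-loaded Whisper model object whose
    transcribe() accepts `language`, `initial_prompt` and `word_timestamps`.
    """
    return model.transcribe(
        audio=audio_path,
        language="hi",           # the spoken language is still Hindi
        initial_prompt=prompt,   # romanized text to bias the output script
        word_timestamps=True,    # same setting as in the question
    )
```

With the model from the question, the call would look like transcribe_hinglish(model, "audio.mp3"). The output may still fall back to Devanagari on some segments, which matches the caution in the reply above.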
