Support of the json files made with Whisper, for a better workflow #138

neopiccolorat · 2025-01-19T03:12:17Z

The JSON files produced by WhisperX are very similar in structure to those created by Vosk, which videogrep already supports. The fields are comparable: WhisperX uses text where the Vosk format uses content, and the start and end timestamps are the same.

It would be fantastic to have videogrep support the JSON files generated by Whisperx, given these similarities.

In short, while videogrep can't currently process JSON files from WhisperX, the differences between these files and the Vosk-style JSON files it already supports are minimal. Adding support for WhisperX JSON would remove the need for a conversion step.

Here’s a snippet of the JSON output from Vosk for reference:

[
	{
		"content": "Hi",
		"start": 1.32,
		"end": 1.56,
		"words": [
			{
				"conf": 1.0,
				"end": 1.56,
				"start": 1.32,
				"word": "Hi"
			}
		]
	},
	{
		"content": "No thanks",
		"start": 2.46,
		"end": 3.63,
		"words": [
			{
				"conf": 1.0,
				"end": 3.12,
				"start": 2.46,
				"word": "No"
			},
			{
				"conf": 1.0,
				"end": 3.63,
				"start": 3.12,
				"word": "Thanks"
			}
		]
	},

And this is what's coming out of WhisperX:

{
    "segments": [
        {
            "start": 1.688,
            "end": 2.008,
            "text": "You're here?",
            "words": [
                {
                    "word": "You're",
                    "start": 1.688,
                    "end": 1.828,
                    "score": 0.34
                },
                {
                    "word": "here?",
                    "start": 1.888,
                    "end": 2.008,
                    "score": 0.907
                }
            ],

The thing is, I have a Python script that converts these files so videogrep recognizes them. That works on a file-by-file basis, but when dozens and dozens of new transcriptions are constantly coming out, and you want to avoid keeping duplicate files (the converted ones are sometimes not compatible with other tools), it would be easier and more practical for videogrep to support the files coming out of Whisper directly.
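For illustration, a conversion of this kind could look roughly like the sketch below. It assumes only the field mapping visible in the two snippets above (text becomes content, score becomes conf, and the top-level "segments" wrapper is dropped). The function names are hypothetical, the handling of words without timestamps is an assumption, and none of this is videogrep's own code.

```python
import json
from pathlib import Path


def whisperx_to_videogrep(whisperx: dict) -> list[dict]:
    """Map a parsed WhisperX result onto the list-of-segments layout shown above.

    Hypothetical helper: the field mapping is inferred from the two snippets only.
    """
    converted = []
    for segment in whisperx.get("segments", []):
        words = []
        for w in segment.get("words", []):
            # Assumption: a word entry may lack timestamps; skip it rather than guess.
            if "start" not in w or "end" not in w:
                continue
            words.append({
                "conf": w.get("score", 1.0),  # WhisperX "score" -> "conf"
                "end": w["end"],
                "start": w["start"],
                "word": w["word"],
            })
        converted.append({
            "content": segment["text"].strip(),  # WhisperX "text" -> "content"
            "start": segment["start"],
            "end": segment["end"],
            "words": words,
        })
    return converted


def convert_file(src: Path, dst: Path) -> None:
    """Read one WhisperX JSON file and write the converted list to dst."""
    data = json.loads(src.read_text(encoding="utf-8"))
    dst.write_text(json.dumps(whisperx_to_videogrep(data), indent=2), encoding="utf-8")
```

Running something like convert_file over every new transcript is exactly the extra step, and the extra copy of each file, that built-in support would make unnecessary.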

I totally get that this is probably a passion project (I'm guessing), but I just want to say how much more useful this little gem is than a way to pull out specific sentences for fun. I’ve used it to isolate speakers in really long interviews, and even though I had to tweak the XML files and the SRT/JSON transcription files to get them to work better with videogrep, I saved hours of manual work.

There's a pretty big audience out there for these awesome tools!

I honestly couldn't believe this existed when I found it last year.

Keep up the amazing work, and thanks a ton for this gem!

neopiccolorat changed the title ~~Support of the json files made with Whisper~~ Support of the json files made with Whisper, for a better workflow Jan 19, 2025
