Support of the json files made with Whisper, for a better workflow #138

Open
neopiccolorat opened this issue Jan 19, 2025 · 0 comments
neopiccolorat commented Jan 19, 2025

The JSON files produced by WhisperX are very similar in structure to the Vosk-style JSON files that videogrep already supports. The fields are comparable and differ mostly in name: WhisperX uses "text" where the supported format uses "content", and "score" where it uses "conf". The start and end timestamps are the same.

It would be fantastic to have videogrep support the JSON files generated by WhisperX, given these similarities.

In short, while videogrep can't currently process JSON files from WhisperX, the differences between these files and the Vosk-style JSON files it supports are minimal. Adding support for WhisperX JSON would noticeably improve compatibility.

Here’s a snippet of the JSON output from Vosk for reference:

[
	{
		"content": "Hi",
		"start": 1.32,
		"end": 1.56,
		"words": [
			{
				"conf": 1.0,
				"end": 1.56,
				"start": 1.32,
				"word": "Hi"
			}
		]
	},
	{
		"content": "No thanks",
		"start": 2.46,
		"end": 3.63,
		"words": [
			{
				"conf": 1.0,
				"end": 3.12,
				"start": 2.46,
				"word": "No"
			},
			{
				"conf": 1.0,
				"end": 3.63,
				"start": 3.12,
				"word": "Thanks"
			}
		]
	},

And this is what comes out of WhisperX:

{
    "segments": [
        {
            "start": 1.688,
            "end": 2.008,
            "text": "You're here?",
            "words": [
                {
                    "word": "You're",
                    "start": 1.688,
                    "end": 1.828,
                    "score": 0.34
                },
                {
                    "word": "here?",
                    "start": 1.888,
                    "end": 2.008,
                    "score": 0.907
                }
            ],

The thing is, I have a Python script that converts these files so that videogrep recognizes them. That's a workable workflow on a file-by-file basis, but when dozens and dozens of new transcriptions are constantly coming in, and you want to avoid keeping duplicate copies (which are sometimes not compatible with other tools), it would be easier and more practical to have videogrep support the files coming out of Whisper internally.
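For reference, a conversion of this kind could look roughly like the sketch below. It is only an illustration, not the actual script mentioned above, and it assumes nothing beyond the field mapping visible in the two snippets: "text" becomes "content", "score" becomes "conf", and the "segments" wrapper is unwrapped into a top-level list. The function names are made up, and skipping words that lack timestamps is an assumption about how to handle incomplete entries.

```python
import json
from pathlib import Path


def whisperx_to_videogrep(data):
    """Map a parsed WhisperX result onto the list-of-segments layout shown above."""
    converted = []
    for segment in data.get("segments", []):
        words = []
        for word in segment.get("words", []):
            # Assumption: a word without both timestamps cannot be cut, so skip it.
            if "start" not in word or "end" not in word:
                continue
            words.append({
                # Assumption: fall back to 1.0 when WhisperX gives no score.
                "conf": word.get("score", 1.0),
                "end": word["end"],
                "start": word["start"],
                "word": word["word"],
            })
        converted.append({
            "content": segment["text"].strip(),
            "start": segment["start"],
            "end": segment["end"],
            "words": words,
        })
    return converted


def convert_file(src, dst):
    """Read a WhisperX JSON file and write the converted list to dst."""
    data = json.loads(Path(src).read_text(encoding="utf-8"))
    result = whisperx_to_videogrep(data)
    Path(dst).write_text(
        json.dumps(result, indent=2, ensure_ascii=False), encoding="utf-8"
    )
```

Calling convert_file("interview.whisperx.json", "interview.json") (both file names are placeholders) would write the converted transcript next to the video. Having the same mapping inside videogrep would make this extra step, and the duplicate files it creates, unnecessary.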

I totally get that this is probably a passion project (I'm guessing), but I just want to say how useful this little gem is beyond pulling out specific sentences for fun. I’ve used it to isolate speakers in really long interviews, and even though I had to tweak the XML files and the SRT/JSON transcription files to get them to work better with videogrep, I saved hours of manual work.

There's a pretty big audience out there for these awesome tools!

I honestly couldn't believe this existed when I found it last year.

Keep up the amazing work, and thanks a ton for this gem!

@neopiccolorat neopiccolorat changed the title Support of the json files made with Whisper Support of the json files made with Whisper, for a better workflow Jan 19, 2025