Offline, privacy-respecting speech to text #22

Open
RustoMCSpit opened this issue Nov 23, 2024 · 1 comment

RustoMCSpit commented Nov 23, 2024

Feature description

Speech-to-text transcription of audio recordings that recognises multiple speakers. The transcript of any recording should be viewable via a dropdown or findable via a search bar, and all transcribed text should be exportable as well.

Why do you want this feature?

Transcripts would make recordings searchable: you could type a word into a search bar and jump to the exact moment it was said in any of your voice recordings. For example, if I typed 'adam', it might find 4 hits from the past 4 months:
file191: 00:07
file179: 12:23, 16:30
file73: 06:42

You could then click on those moments to find the one you're looking for.
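
A minimal sketch of the search described above, written in Python purely for illustration. It assumes each recording already has a transcript with word-level timestamps; the `Word` type, `build_index`, `search`, and the sample data are hypothetical and not part of Simple Voice Recorder or Futo's voice input.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start_s: float  # offset from the start of the recording, in seconds


def build_index(transcripts):
    """Map each lowercased word to {file name: [start offsets in seconds]}."""
    index = defaultdict(lambda: defaultdict(list))
    for file_name, words in transcripts.items():
        for word in words:
            index[word.text.lower()][file_name].append(word.start_s)
    return index


def fmt(seconds):
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def search(index, query):
    """Return {file name: [MM:SS, ...]} for every occurrence of the query word."""
    hits = index.get(query.lower(), {})
    return {name: [fmt(t) for t in sorted(times)] for name, times in hits.items()}


# Hypothetical sample data mirroring the 'adam' example above.
transcripts = {
    "file191": [Word("hi", 5.0), Word("Adam", 7.0)],
    "file179": [Word("adam", 743.0), Word("said", 744.0), Word("Adam", 990.0)],
    "file73": [Word("Adam", 402.0)],
}
index = build_index(transcripts)
print(search(index, "adam"))
# {'file191': ['00:07'], 'file179': ['12:23', '16:30'], 'file73': ['06:42']}
```

The sketch matches whole single words case-insensitively. Punctuation stripping and multi-word phrases are left out to keep it short.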

This could also be used for tagging. For example, if I'm working on a project called 'block runner', I could search for all mentions and tag them all easily.

Additional information

Futo has partially delivered on this with an excellent FOSS solution:
https://gitlab.futo.org/alex/voiceinput
https://voiceinput.futo.org/

But the Futo solution currently works only within other apps and is not integrated directly into a voice recorder app. Adding Futo's speech-to-text capabilities to Simple Voice Recorder would easily put it on par with Google's proprietary app.

FossifyOrg/Voice-Recorder#34

RustoMCSpit commented Nov 24, 2024

You should be able to see the transcript underneath the waveform, moving along with it as the recording goes on, with the current word highlighted. Clicking on it would bring you to the full transcript, which you can copy and paste.
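
One possible way to pick the word to highlight, sketched in Python for illustration only. It assumes the transcript stores a sorted list of word start times; `current_word_index` and the sample data are hypothetical names, not an existing API.

```python
from bisect import bisect_right


def current_word_index(word_starts, position_s):
    """Index of the last word that started at or before position_s.

    Returns None if playback has not reached the first word yet.
    word_starts must be sorted in ascending order.
    """
    i = bisect_right(word_starts, position_s) - 1
    return i if i >= 0 else None


# Hypothetical word start times (seconds) for the transcript "hello there adam".
words = ["hello", "there", "adam"]
starts = [0.4, 0.9, 1.6]

i = current_word_index(starts, 1.0)
print(words[i])  # there
```

A binary search keeps the lookup cheap enough to run on every playback position update, even for long recordings.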

You could pair this with pitch detection and then export MIDI and notation transcriptions with the words attached.
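
As a rough illustration of the MIDI side, the sketch below converts a detected pitch to the nearest MIDI note number using the standard equal-temperament mapping (A4 = 440 Hz = note 69). It assumes pitch detection has already produced one frequency per word; `frequency_to_midi` and the sample frequencies are hypothetical.

```python
import math


def frequency_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency, with A4 = 440 Hz = note 69."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


# Hypothetical (word, detected frequency in Hz) pairs from a sung recording.
detected = [("block", 261.63), ("runner", 440.0)]

# Attach a MIDI note number to each word.
notes = [(word, frequency_to_midi(freq)) for word, freq in detected]
print(notes)  # [('block', 60), ('runner', 69)]
```

Writing an actual MIDI file and rendering notation would need extra libraries and is not shown here.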
