Given that your tool is exceptionally well-programmed and works seamlessly across applications, it would be great to add voice typing (speech-to-text) powered by Gemini or another suitable LLM.
It would be fantastic to have a dedicated button in your app rather than relying on inferior voice-typing solutions. Perhaps my perspective is mistaken; if so, please correct me.
This is an interesting request that we could think about adding in the future.
There's actually a very nice dedicated model for this from OpenAI called Whisper. However, running it locally requires roughly 4 GB of VRAM/RAM, and most people certainly wouldn't be able to run that alongside a local LLM. I did find one project that does what you're requesting with it: https://github.com/savbell/whisper-writer
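For reference, local transcription with Whisper would look roughly like this. This is just a minimal sketch assuming the `openai-whisper` Python package and ffmpeg are installed; the model size and audio file name are placeholders:

```python
# Minimal local Whisper sketch (assumes `pip install openai-whisper` and ffmpeg).
import whisper

# "base" is one of the smaller checkpoints; the larger ones are what need ~4 GB+.
model = whisper.load_model("base")

# "recording.wav" is a placeholder path to the captured audio.
result = model.transcribe("recording.wav")
print(result["text"])
```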
A way to get free and accessible state-of-the-art transcription would be to use the Gemini API and ask a multimodal Gemini 2.0 model for a transcript. However, I'm unsure what the latency would be like.
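Something like the following could work. This is only a rough sketch assuming the `google-generativeai` Python SDK and its Files API; the model name, prompt, and file path are illustrative, not a final design:

```python
# Rough sketch of transcription via the Gemini API (assumes `pip install google-generativeai`).
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")     # placeholder key
audio = genai.upload_file("recording.wav")  # placeholder audio file

# Model name is illustrative; any multimodal Gemini 2.0 model that accepts audio would do.
model = genai.GenerativeModel("gemini-2.0-flash")
response = model.generate_content(["Transcribe this audio verbatim.", audio])
print(response.text)
```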
This is not something I can immediately work on, and I'd also like to hear what others think about this proposal first.