Weekly Report 2024-11-01 #1

Open
Jim-Hutchinson opened this issue Oct 31, 2024 · 2 comments

Comments

@Jim-Hutchinson
Collaborator

Removed popup message functionality as it is not helpful for a visually impaired user and cluttered the rest of the code.

Began implementing Piper TTS for AI voice synthesis instead. The model is used by NVDA, and I found it through a project very similar to my own: image captioning for the visually impaired.

The main version of Piper TTS does not currently support AMD hardware, which I use. However, a fork with AMD support exists, so I will proceed with that version.

@Jim-Hutchinson
Collaborator Author

I'm having issues installing Piper TTS. I may need to change my Python version to 3.10.12. If that does not work, I will find a different text-to-speech model to use instead.

@Jim-Hutchinson
Collaborator Author

I ended up setting up Silero TTS with the v3_en model from here. The model offers multiple speakers and runs at a reasonable speed on my hardware, even on my laptop. Once it is hosted on a more powerful machine, I hope to make the speech near real-time.
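For reference, here is a minimal sketch of how the v3_en model could be loaded and timed. It assumes the model is pulled through `torch.hub` from the `snakers4/silero-models` repository, as that project's README describes. The speaker name `en_0`, the 48 kHz sample rate, the sample sentence, and the helper names `real_time_factor`, `load_model`, `synthesize`, and `demo` are illustrative choices, not taken from this project's code. The real-time factor (synthesis time divided by audio duration) is one common way to put a number on "near real-time": a value below 1.0 means the audio is generated faster than it plays back.

```python
import time

SAMPLE_RATE = 48000  # assumed output rate; chosen for illustration


def real_time_factor(synthesis_seconds, num_samples, sample_rate):
    """Synthesis time divided by audio duration; below 1.0 is faster than real time."""
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


def load_model(device="cpu"):
    """Load the Silero v3_en model via torch.hub (downloads it on first use)."""
    import torch  # imported lazily so real_time_factor works without PyTorch

    model, _example_text = torch.hub.load(
        repo_or_dir="snakers4/silero-models",
        model="silero_tts",
        language="en",
        speaker="v3_en",
    )
    model.to(device)
    return model


def synthesize(model, text, speaker="en_0"):
    """Return (audio, elapsed_seconds); audio is a 1-D tensor of samples."""
    start = time.perf_counter()
    audio = model.apply_tts(text=text, speaker=speaker, sample_rate=SAMPLE_RATE)
    elapsed = time.perf_counter() - start
    return audio, elapsed


def demo():
    """Hypothetical end-to-end run: synthesize one caption and report the RTF."""
    model = load_model()
    audio, elapsed = synthesize(model, "A dog is sitting on a park bench.")
    rtf = real_time_factor(elapsed, audio.shape[0], SAMPLE_RATE)
    print(f"Real-time factor: {rtf:.2f}")
```

Calling `demo()` on both the laptop and the more powerful host would give a direct comparison of the two. Loading the model once and reusing it across captions keeps the download and initialization cost out of the measurement.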
