diff --git a/README.md b/README.md
index 0f48f733..953ab929 100644
--- a/README.md
+++ b/README.md
@@ -7,13 +7,13 @@
 
 Try it out on the [HuggingFace Space](https://huggingface.co/spaces/speaches-ai/speaches)
 
-See the documentation for installation instructions and usage: [https://speaches-ai.github.io/speaches/](https://speaches-ai.github.io/speaches/)
+See the documentation for installation instructions and usage: [speaches.ai](https://speaches.ai/)
 
 ## Features:
 
 - GPU and CPU support.
-- [Deployable via Docker Compose / Docker](https://speaches-ai.github.io/speaches/installation/)
-- [Highly configurable](https://speaches-ai.github.io/speaches/configuration/)
+- [Deployable via Docker Compose / Docker](https://speaches.ai/installation/)
+- [Highly configurable](https://speaches.ai/configuration/)
 - OpenAI API compatible. All tools and SDKs that work with OpenAI's API should work with `speaches`.
 - Streaming support (transcription is sent via SSE as the audio is transcribed. You don't need to wait for the audio to fully be transcribed before receiving it).
 
@@ -40,7 +40,6 @@ TODO
 
 https://github.com/user-attachments/assets/0021acd9-f480-4bc3-904d-831f54c4d45b
 
-
 ### Live Transcription (using WebSockets)
 
 https://github.com/fedirz/faster-whisper-server/assets/76551385/e334c124-af61-41d4-839c-874be150598f
diff --git a/docs/index.md b/docs/index.md
index 022a0e32..16879c2f 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -23,8 +23,8 @@
 - Dynamic model loading / offloading. Just specify which model you want to use in the request and it will be loaded automatically. It will then be unloaded after a period of inactivity.
 - Text-to-Speech via `kokoro`(Ranked #1 in the [TTS Arena](https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena)) and `piper` models.
 - GPU and CPU support.
-- [Deployable via Docker Compose / Docker](https://speaches-ai.github.io/speaches/installation/)
-- [Highly configurable](https://speaches-ai.github.io/speaches/configuration/)
+- [Deployable via Docker Compose / Docker](https://speaches.ai/installation/)
+- [Highly configurable](https://speaches.ai/configuration/)
 - [Coming soon](https://github.com/speaches-ai/speaches/issues/115): Realtime API | [OpenAI Documentation](https://platform.openai.com/docs/guides/realtime)
 
 Please create an issue if you find a bug, have a question, or a feature suggestion.
diff --git a/docs/usage/voice-chat.md b/docs/usage/voice-chat.md
index 19947ab7..ff4975a6 100644
--- a/docs/usage/voice-chat.md
+++ b/docs/usage/voice-chat.md
@@ -1,14 +1,14 @@
 !!! note
 
-    Before proceeding, you should be familiar with [OpenAI Audio Generation Guide](https://platform.openai.com/docs/guides/audio). The guide explains how the API works and provides examples on how to use. Unless stated otherwise in [Limitations](#limitations) if a feature is supported by OpenAI, it should be supported by this project as well.
+    Before proceeding, you should be familiar with the [OpenAI Audio Generation Guide](https://platform.openai.com/docs/guides/audio). The guide explains how the API works and provides examples of how to use it. Unless stated otherwise in [limitations](#limitations), if a feature is supported by OpenAI, it should be supported by this project as well.
 
 ## Prerequisites
 
 Follow the prerequisites in the [Text-to-Speech](./text-to-speech.md) guide.
 
 And set the following environmental variables:
-- `CHAT_COMPLETION_BASE_URL` to the base url of an OpenAI API compatible endpoint | [Config](../configuration.md)
-- `CHAT_COMPLETION_MODEL` to the name of the model you'd like to use. | [Config](../configuration.md)
-- `CHAT_COMPLETION_API_KEY` if the API you are using requires authentication | [Config](../configuration.md)
+- `CHAT_COMPLETION_BASE_URL` to the base URL of an OpenAI API compatible endpoint | [Config](../configuration.md#speaches.config.Config.chat_completion_base_url)
+- `CHAT_COMPLETION_MODEL` to the name of the model you'd like to use | [Config](../configuration.md#speaches.config.Config.chat_completion_model)
+- `CHAT_COMPLETION_API_KEY` if the API you are using requires authentication | [Config](../configuration.md#speaches.config.Config.chat_completion_api_key)
 
 Ollama example:
 
@@ -143,4 +143,4 @@ openai_client.chat.completions.create(
 - User's input audio message are not cached. That means the user's input audio message will be transcribed each time it sent. This can be a performance issue when doing long multi-turn conversations.
 - Multiple choices (`n` > 1) are not supported
 
-This features utilizes [./text-to-speech.md](Text-to-Speech) and [./speech-to-text.md](Speech-to-Text) features. Therefore, the limitations of those features apply here as well.
+This feature utilizes the [Text-to-Speech](./text-to-speech.md) and [Speech-to-Text](./speech-to-text.md) features. Therefore, the limitations of those features apply here as well.
diff --git a/src/speaches/ui/app.py b/src/speaches/ui/app.py
index 2fb35bfb..dd96ec12 100644
--- a/src/speaches/ui/app.py
+++ b/src/speaches/ui/app.py
@@ -14,9 +14,9 @@ def create_gradio_demo(config: Config) -> gr.Blocks:
         gr.Markdown(
             "### Consider supporting the project by starring the [speaches-ai/speaches repository on GitHub](https://github.com/speaches-ai/speaches)."
         )
-        gr.Markdown("### Documentation Website: https://speaches-ai.github.io/speaches")
+        gr.Markdown("### Documentation Website: https://speaches.ai")
         gr.Markdown(
-            "### For additional details regarding the parameters, see the [API Documentation](https://speaches-ai.github.io/speaches/api)"
+            "### For additional details regarding the parameters, see the [API Documentation](https://speaches.ai/api)"
         )
         create_audio_chat_tab(config)
 
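For context on the `CHAT_COMPLETION_*` variables the voice-chat hunk documents, here is a minimal sketch of how they fit together end to end. Everything concrete in it is an assumption rather than part of the patch: the speaches server address (`http://localhost:8000/v1`), the Ollama base URL and `llama3.2` model name, the `alloy` voice, and the placeholder API key. The request shape follows OpenAI's audio generation guide, which the docs above defer to.

```python
# Hypothetical sketch, not part of the patch. Assumes a speaches server at
# http://localhost:8000/v1 that was started with, for example:
#   CHAT_COMPLETION_BASE_URL=http://localhost:11434/v1  # local Ollama's OpenAI-compatible endpoint
#   CHAT_COMPLETION_MODEL=llama3.2                      # illustrative model name
#   CHAT_COMPLETION_API_KEY unset                       # a local Ollama server needs no auth
import base64

from openai import OpenAI

# speaches is OpenAI API compatible, so the official SDK is pointed at it directly.
openai_client = OpenAI(base_url="http://localhost:8000/v1", api_key="cant-be-empty")

# Request both a text and an audio response, mirroring OpenAI's audio generation guide.
completion = openai_client.chat.completions.create(
    model="gpt-4o-audio-preview",  # model name per OpenAI's guide; illustrative here
    modalities=["text", "audio"],
    audio={"voice": "alloy", "format": "wav"},  # voice name is an assumption
    messages=[{"role": "user", "content": "Is a golden retriever a good family dog?"}],
)

# The spoken answer comes back base64-encoded alongside the text transcript.
wav_bytes = base64.b64decode(completion.choices[0].message.audio.data)
with open("answer.wav", "wb") as f:
    f.write(wav_bytes)
```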