- Python 3.8 and pip
- ffmpeg
- Download and extract voicevox_engine from https://github.com/VOICEVOX/voicevox_engine/releases
- Set the environment variable OPENAI_KEY
Environment and dependencies
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Generate API docs
python make_docs.py
Run local server
python run.py --voicevox_dir=voicevox_engine_folder
API docs
There are two endpoints:
- http://localhost:50021/voice_voice?speaker=1&whisper=true
- http://localhost:50021/text_voice?speaker=1&text=すぐ修正できそうですか?
The whisper parameter selects the speech-to-text backend:
- whisper=true: use the Whisper API
- whisper=false: use the local speech-to-text model (slower)
The API accepts .wav files.
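The text parameter contains non-ASCII characters, so it should be percent-encoded before the request is sent. The following is a minimal sketch of building both endpoint URLs with the Python standard library. The helper name build_url is hypothetical and not part of this project, and the sketch does not send the request, because the upload format for the .wav file is not described here.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:50021"  # host and port taken from the endpoint examples


def build_url(endpoint: str, **params) -> str:
    """Return BASE_URL/endpoint with params percent-encoded as a query string.

    Hypothetical helper for illustration; not part of the project.
    """
    # Booleans are lowercased so that True becomes "true", matching whisper=true.
    normalized = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    # urlencode encodes non-ASCII text as UTF-8 percent escapes.
    return f"{BASE_URL}/{endpoint}?{urlencode(normalized)}"


voice_voice_url = build_url("voice_voice", speaker=1, whisper=True)
text_voice_url = build_url("text_voice", speaker=1, text="すぐ修正できそうですか?")

print(voice_voice_url)
print(text_voice_url)
```

Encoding the query this way also escapes the trailing "?" in the sample text, so it is not mistaken for a URL delimiter.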
Run with Docker
docker build .
docker run -p 50021:50021 -e OPENAI_KEY=xxx image_id
Define the database connection URL in database/setting.py
SQLALCHEMY_DATABASE_URL = "sqlite:///./sql_app.db"
Run with mock engine
python run.py --enable_mock
Sign up and log in to get an authorization token
Then copy the access_token and pass it in the request headers when calling the two endpoints, voice_voice and text_voice
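As a rough illustration of passing the token in the headers, the sketch below builds a request object with the Python standard library. It assumes the server expects the common "Bearer <token>" scheme in the Authorization header, which is not confirmed here. The helper authorized_request and the placeholder token value are hypothetical.

```python
from urllib.request import Request

# Placeholder; replace with the access_token returned after login.
ACCESS_TOKEN = "paste-access_token-here"


def authorized_request(url: str, token: str) -> Request:
    """Build a request that carries the token in the Authorization header.

    Hypothetical helper for illustration; not part of the project.
    """
    # Assumption: the server uses the "Bearer <token>" scheme.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


req = authorized_request(
    "http://localhost:50021/text_voice?speaker=1&text=test",
    ACCESS_TOKEN,
)
print(req.get_header("Authorization"))
```

The request is only constructed, not sent. If the server uses a different header format, only the value built inside authorized_request needs to change.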