A Python-based tool for generating parallel subtitle videos for opera performances, with synchronized original language and translated text.
This is the repo behind the YouTube channel of the same name!
- Automatic audio transcription using OpenAI's Whisper model
- Vocal separation using Demucs
- Text alignment between transcribed audio and libretto
- Parallel subtitle video generation with original and translated text
- Support for multiple languages
- Configurable video output settings
- Python 3.8+
- FFmpeg
- yt-dlp
- OpenAI API key
- Clone the repository:
git clone <repository-url>
cd kunstwerk
- Install dependencies:
pip install -r requirements.txt
- Set up your OpenAI API key:
export OPENAI_API_KEY='your-api-key'
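Before running the pipeline, it can help to confirm that the prerequisites listed above are actually available. The following is a minimal sketch, not part of the repository: the function name `missing_prerequisites` is hypothetical, and it assumes the tools are installed under their usual executable names (`ffmpeg`, `yt-dlp`).

```python
import os
import shutil


def missing_prerequisites(tools=("ffmpeg", "yt-dlp"), env_vars=("OPENAI_API_KEY",)):
    """Return a list of human-readable problems; an empty list means all were found."""
    problems = []
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"executable not found on PATH: {tool}")
    for name in env_vars:
        # Treat an unset or empty variable as missing.
        if not os.environ.get(name):
            problems.append(f"environment variable not set: {name}")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(problem)
```

Running the script prints one line per missing prerequisite and nothing if everything is in place.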
Example output: Tristan und Isolde with parallel subtitles
- Create a YAML configuration file for your opera (see example configs):
title: TRISTAN UND ISOLDE
file_prefix: tristan
language: de
start_idx: 1
end_idx: 33
overture_indices: [1]
secondary_color: Silver
video_width: 3840
video_height: 2160
font_size: 96
res_divisor: 1
playlist_url: https://www.youtube.com/playlist?list=EXAMPLE
characters:
- Tristan
- Isolde
# Add other characters...
- Process the opera:
python kunstwerk.py configs/your_config.yaml
You can also skip certain steps if you've already completed them:
# Skip download/separation if you already have the audio files:
python kunstwerk.py configs/your_config.yaml --skip-download
# Skip transcription if you already have the transcriptions:
python kunstwerk.py configs/your_config.yaml --skip-transcribe
# Skip both download and transcription:
python kunstwerk.py configs/your_config.yaml --skip-download --skip-transcribe
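The skip flags above can be pictured as switches that drop stages from the pipeline. The sketch below shows one way such a command line could be parsed with `argparse`; it is an illustration, not the actual contents of `kunstwerk.py`. The helper names `build_parser` and `planned_steps`, the step labels, and the assumption that alignment and video generation always run are all hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command line (a sketch)."""
    parser = argparse.ArgumentParser(prog="kunstwerk.py")
    parser.add_argument("config", help="path to the opera's YAML configuration file")
    parser.add_argument("--skip-download", action="store_true",
                        help="reuse audio files that were already downloaded and separated")
    parser.add_argument("--skip-transcribe", action="store_true",
                        help="reuse transcriptions that already exist")
    return parser


def planned_steps(args):
    """Return the stages that would run, in order, for the given flags."""
    steps = []
    if not args.skip_download:
        steps.append("download and separate audio")
    if not args.skip_transcribe:
        steps.append("transcribe")
    # Assumption: the remaining stages are never skipped.
    steps += ["align", "make video"]
    return steps


if __name__ == "__main__":
    args = build_parser().parse_args(["configs/your_config.yaml", "--skip-download"])
    print(planned_steps(args))
```

Note that `argparse` exposes `--skip-download` as the attribute `skip_download`, replacing the hyphen with an underscore.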
- title: Opera title displayed in the video
- file_prefix: Prefix for generated files
- language: Source language code (e.g., 'de' for German)
- start_idx / end_idx: Range of scenes to process
- overture_indices: List of instrumental sections to skip
- secondary_color: Color for translated text
- video_width / video_height: Output video dimensions
- font_size: Base font size
- res_divisor: Resolution scaling factor
- playlist_url: YouTube playlist URL for downloading
- characters: List of character names for formatting
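To make the index and resolution parameters concrete, here is a small sketch of how they might combine. It rests on two assumptions that the descriptions above do not confirm: that `end_idx` is inclusive, and that `res_divisor` divides both output dimensions. The function names are hypothetical.

```python
def scenes_to_process(start_idx, end_idx, overture_indices=()):
    """Scene numbers from start_idx to end_idx, minus instrumental sections.

    Assumption: end_idx is inclusive.
    """
    if start_idx > end_idx:
        raise ValueError("start_idx must not exceed end_idx")
    skipped = set(overture_indices)
    return [i for i in range(start_idx, end_idx + 1) if i not in skipped]


def output_size(video_width, video_height, res_divisor=1):
    """Scaled output dimensions.

    Assumption: res_divisor is a positive integer applied to both dimensions.
    """
    if res_divisor < 1:
        raise ValueError("res_divisor must be at least 1")
    return video_width // res_divisor, video_height // res_divisor


if __name__ == "__main__":
    # Values taken from the example configuration.
    scenes = scenes_to_process(1, 33, overture_indices=[1])
    print(len(scenes), scenes[0], scenes[-1])
    print(output_size(3840, 2160, res_divisor=2))
```

Under these assumptions, the example configuration would process scenes 2 through 33, and a `res_divisor` of 2 would give a quick 1920x1080 preview render instead of the full 3840x2160 output.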
- separate.sh: Downloads and separates audio
- transcribe.py: Handles audio transcription
- make_video.py: Generates the final video
- align.py: Aligns transcribed text with libretto
- config_parser.py: Parses YAML configuration
- video_gen/: Video generation modules
  - config/: Configuration classes
  - frame/: Frame generation
  - text/: Text formatting
  - video/: Video creation
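The alignment step is the least obvious part of the pipeline, so a rough illustration may help. The sketch below is not the algorithm used in `align.py`. It swaps in the standard library's `difflib.SequenceMatcher` to match libretto words against transcribed words, and it assumes the transcription step yields word-level start times. The function names and the `(word, start_seconds)` input shape are hypothetical.

```python
import difflib
import re


def normalize(word):
    """Lowercase and strip punctuation so 'Isolde!' matches 'isolde'."""
    return re.sub(r"[^\w]", "", word.lower())


def align_lines(libretto_lines, transcript):
    """Map each libretto line to the start time of its first matched word.

    transcript is a list of (word, start_seconds) pairs. Lines with no
    matched word map to None.
    """
    # Flatten the libretto into words, remembering which line each came from.
    lib_words, line_of_word = [], []
    for line_no, line in enumerate(libretto_lines):
        for word in line.split():
            lib_words.append(normalize(word))
            line_of_word.append(line_no)

    heard = [normalize(word) for word, _ in transcript]
    matcher = difflib.SequenceMatcher(a=lib_words, b=heard, autojunk=False)

    starts = [None] * len(libretto_lines)
    # Matching blocks come back in increasing order, so the first hit per
    # line is also its earliest matched word.
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            line_no = line_of_word[block.a + offset]
            if starts[line_no] is None:
                starts[line_no] = transcript[block.b + offset][1]
    return starts


if __name__ == "__main__":
    libretto = ["Mild und leise", "wie er lächelt"]
    transcript = [("mild", 1.0), ("und", 1.4), ("leise", 2.0),
                  ("wie", 3.5), ("er", 3.8), ("lechelt", 4.1)]
    print(align_lines(libretto, transcript))  # [1.0, 3.5]
```

Because the libretto is treated as the reference text, a mistranscribed word such as "lechelt" only costs one match; the line still gets a timestamp from its other words.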