---
output: github_document
---
```{r, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
# transcribe <img src="man/figures/logo.png" align="right" height="139"/>
[Build status](https://github.com/brancengregory/transcribe/actions)

The **transcribe** package provides an R interface to audio transcription using Whisper, with optional post‑processing via Ollama. It includes a command‑line interface (CLI) and a Plumber API that serves a web interface.
## Prerequisites
- **Whisper**:
  Make sure you have [OpenAI Whisper](https://github.com/openai/whisper) installed and configured. Follow the official documentation for installation instructions.
- **Ollama**:
  Ensure that [Ollama](https://ollama.com) is installed and running. Consult the Ollama documentation for setup details and configuration.
- **Other Dependencies**:
  This package uses:
  - **processx** to wrap the `yt-dlp` command for downloading audio files.
  - **reticulate** to call Python’s Whisper implementation.
  - **ellmer** for prompt-based post‑processing of raw Whisper transcripts.

  Please refer to each package's documentation for further details. A quick way to check that these tools are reachable from R is sketched after this list.
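The chunk below is a minimal sketch (not part of the package) for verifying the prerequisites from an R session. It assumes `yt-dlp` and the `ollama` CLI are on your `PATH` and that **reticulate** points at the Python environment where Whisper is installed.

```{r, eval=FALSE}
# Hypothetical prerequisite check -- adjust paths and environments to your setup.
Sys.which("yt-dlp")                         # non-empty path if yt-dlp is installed
reticulate::py_module_available("whisper")  # TRUE if Python's whisper can be imported
processx::run("ollama", "list", error_on_status = FALSE)$status == 0  # TRUE if Ollama responds
```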
## Overview
The package workflow is as follows:

- **Audio Downloading**:
  When given a remote URL, **processx** is used to call `yt-dlp`, downloading the audio file quickly and robustly.
- **Transcription via Whisper**:
  Python’s Whisper is called via **reticulate**, providing state‑of‑the‑art transcription directly from R.
- **Post‑processing with Ollama and ellmer**:
  The raw transcript from Whisper is optionally sent to Ollama via **ellmer** using a prompt (e.g., “Reformat the transcript into clear, well‑punctuated paragraphs…”) to produce a cleaned, readable transcript.
- **Interfaces**:
  Use the CLI for batch processing, or launch the Plumber API to access a web interface.

The sketch after this list shows, purely for orientation, how these pieces fit together.
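This illustration strings the steps together with direct calls to **processx**, **reticulate**, and **ellmer**; it is not the package's internal implementation, and the Whisper Python calls (`load_model()`, `transcribe()`) and the example URL are assumptions.

```{r, eval=FALSE}
library(processx)
library(reticulate)
library(ellmer)

# 1. Download audio from a remote URL with yt-dlp (the URL is a placeholder).
processx::run(
  "yt-dlp",
  c("-x", "--audio-format", "wav", "-o", "audio.wav", "https://example.com/talk"),
  echo = TRUE
)

# 2. Transcribe with Python's Whisper via reticulate.
whisper <- reticulate::import("whisper")
model <- whisper$load_model("large-v3-turbo")
raw_text <- model$transcribe("audio.wav", language = "en")$text

# 3. Optionally clean up the raw transcript with a local Ollama model via ellmer.
chat <- ellmer::chat_ollama(model = "llama3.2")
clean_text <- chat$chat(paste(
  "Reformat the transcript into clear, well-punctuated paragraphs:",
  raw_text
))
cat(clean_text)
```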
## Installation
```{r, eval=FALSE}
# install.packages("remotes") # if not already installed
remotes::install_github("brancengregory/transcribe")
```
## Usage
### Basic Example
```{r, eval=FALSE}
library(transcribe)

# Transcribe a local audio file.
transcript <- transcribe_audio(
  input_path = "path/to/audio.wav",
  language = "en",
  whisper_model_name = "large-v3-turbo",
  processed = TRUE,          # post-process the raw transcript with Ollama
  ollama_model = "llama3.2"
)

cat(transcript)
```
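For several files, the same function can be called in a loop. The following is a sketch that assumes the arguments shown above; the directory and output file names are placeholders.

```{r, eval=FALSE}
# Transcribe every WAV file in a directory (sketch; argument names as above).
files <- list.files("path/to/audio", pattern = "\\.wav$", full.names = TRUE)
transcripts <- lapply(files, function(f) {
  transcribe_audio(
    input_path = f,
    language = "en",
    whisper_model_name = "large-v3-turbo",
    processed = TRUE,
    ollama_model = "llama3.2"
  )
})
writeLines(unlist(transcripts), "transcripts.txt")
```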
### CLI Usage
To use the command‑line interface, run the following command from your terminal:
```bash
Rscript inst/scripts/main_cli.R -i "path/to/audio.wav" -l en -m large-v3-turbo -p TRUE -M llama3.2 -o "transcribe.txt"
```
This command processes the specified audio file and saves the transcript to `transcribe.txt`. The flags mirror the arguments of `transcribe_audio()`: `-i` the input path, `-l` the language, `-m` the Whisper model, `-p` whether to post-process with Ollama, `-M` the Ollama model, and `-o` the output file.
### Plumber API
You can serve a web interface via **plumber**. For example:
```{r, eval=FALSE}
library(plumber)
plumber::plumb("inst/plumber/api.R")$run(port = 7608)
```
Then open your browser at [http://127.0.0.1:7608](http://127.0.0.1:7608) to use the transcription interface.
## Vignettes
For a detailed introduction, see the vignette “intro”:
```{r, eval=FALSE}
vignette("intro", package = "transcribe")
```
## Contributing
1. Fork the repository on GitHub.
2. Create a new branch for your changes.
3. Submit a pull request describing your proposed changes.
## License
MIT © 2025 Your Name