ndrean committed Jan 16, 2024
1 parent a0ddde9 commit c8d7473
Showing 8 changed files with 138 additions and 31 deletions.
76 changes: 71 additions & 5 deletions README.md
@@ -3210,6 +3210,9 @@ We firstly capture the audio and upload it to the server.
Source: <https://dockyard.com/blog/2023/03/07/audio-speech-recognition-in-elixir-with-whisper-bumblebee?utm_source=elixir-merge>

We use a form to capture the audio with the MediaRecorder API. The JavaScript code is triggered by an attached hook _Audio_ declared in the HTML. We use a `live_file_input` and will add the server-side code afterwards.
We also let the user listen to their audio by adding an embedded `<audio>` element in the HTML. Its source is the audio blob as a URL object.

We also add a spinner to show that the transcription process is running, in the same way as we did for the captioning process. We introduce a Phoenix component to avoid code duplication.

```html
# /lib/app_web/live/page_live.html.heex
@@ -3233,9 +3236,16 @@ We use a form to capture the audio and use the MediaRecorder API. The Javascript
</p>
<audio id="audio" controls></audio>
<AppWeb.Spinner.spin spin={@speech_spin} />
<p id="output"><%= @transcription %></p>
```

where we used the component:
The Spinner component takes a `spin` attribute, which we set from a socket assign. You can also use it to display the spinner when the captioning task is running, with:

```elixir
<AppWeb.Spinner.spin spin={@running?} />
```

The component is:

```elixir
# /lib/app_web/components/spinner.ex
```

@@ -3447,8 +3457,51 @@ We will add the Elixir binding `HNSWLib`:
#### Transformer model and HNSWLib Index setup

We will encode every caption as a vector with the appropriate serving that runs the "sentence-transformers/paraphrase-MiniLM-L6-v2" model.
This will be done in a GenServer since we also want to load the embedding model. We endow the vector space with a _cosine_ pseudo-metric. We also load the model.
In the GenServer, we will also instantiate the HNSWLib Index struct which is saved into a file. When the app starts, we either read the existing file or create a new one.
The model will be loaded via a GenServer.
In a second GenServer, we instantiate the HNSWLib Index struct, which is persisted in a file. We endow the vector space with a _cosine_ pseudo-metric. When the app starts, we either read the existing file or create a new one.

```elixir
defmodule App.KnnIndex do
  use GenServer

  @indexes "indexes.bin"

  def start_link(_) do
    GenServer.start_link(__MODULE__, {}, name: __MODULE__)
  end

  def init(_) do
    upload_dir = Application.app_dir(:app, ["priv", "static", "uploads"])
    File.mkdir_p!(upload_dir)

    path = Path.join([upload_dir, @indexes])
    space = :cosine

    require Logger

    {:ok, index} =
      case File.exists?(path) do
        false ->
          Logger.info("New Index")
          HNSWLib.Index.new(_space = space, _dim = 384, _max_elements = 200)

        true ->
          Logger.info("Existing Index")
          HNSWLib.Index.load_index(space, 384, path)
      end

    {:ok, index}
  end

  def load_index do
    GenServer.call(__MODULE__, :load)
  end

  def handle_call(:load, _from, state) do
    {:reply, state, state}
  end
end
```
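
The `load_index/0` and `handle_call(:load, ...)` pair above does nothing more than hand the GenServer state back to the caller. A minimal, standard-library-only sketch of that pattern follows; `Demo.Holder` and the placeholder map standing in for the HNSWLib index are hypothetical names used for illustration, not part of the app:

```elixir
defmodule Demo.Holder do
  use GenServer

  # `initial` stands in for the HNSWLib index; any term works here.
  def start_link(initial) do
    GenServer.start_link(__MODULE__, initial, name: __MODULE__)
  end

  # Client API: fetch whatever the process holds as its state.
  def get do
    GenServer.call(__MODULE__, :get)
  end

  @impl true
  def init(initial), do: {:ok, initial}

  # Reply with the state and keep it unchanged.
  @impl true
  def handle_call(:get, _from, state) do
    {:reply, state, state}
  end
end

{:ok, _pid} = Demo.Holder.start_link(%{space: :cosine, dim: 384})
IO.inspect(Demo.Holder.get())
```

Registering the process under its module name is what lets any caller, such as a LiveView `mount/3`, fetch the state without holding a pid.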

```elixir
# /lib/app/text_embedding.ex
@@ -3516,6 +3569,8 @@ The GenServer is started in the Application module.
children = [
  ...,
  App.TextEmbedding,
  App.KnnIndex,
  ...
]
```
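
Since both processes are started by the supervisor at boot, a slow `init/1` would hold up application startup. `App.TextEmbedding` avoids this by returning `{:continue, :load}` from `init/1` and loading the model in `handle_continue/2`. The sketch below illustrates that mechanism with the standard library only; `Demo.SlowLoader` and the `Process.sleep/1` stand-in for the model download are hypothetical:

```elixir
defmodule Demo.SlowLoader do
  use GenServer

  def start_link(_opts) do
    GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  end

  # Client API: returns the current state of the loader.
  def status do
    GenServer.call(__MODULE__, :status)
  end

  @impl true
  def init(_) do
    # Return immediately; the expensive work runs in handle_continue/2.
    {:ok, :loading, {:continue, :load}}
  end

  @impl true
  def handle_continue(:load, _state) do
    # Stand-in for a slow model download (assumption for illustration).
    Process.sleep(100)
    {:noreply, :ready}
  end

  @impl true
  def handle_call(:status, _from, state) do
    {:reply, state, state}
  end
end

{:ok, _pid} = Demo.SlowLoader.start_link([])
IO.inspect(Demo.SlowLoader.status())
```

Note that a call sent right after `start_link/1` is not answered with the initial state: it waits in the mailbox until `handle_continue/2` returns, so it is still subject to the `GenServer.call/3` timeout (5 seconds by default).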

@@ -3525,11 +3580,13 @@ We firstly append the Liveview socket with the index and the serving of the tran

```elixir
def mount(_, _, socket) do
{serving, index} = App.TextEmbedding.serve()
serving = App.TextEmbedding.serve()
index = App.KnnIndex.load_index()
...
{:ok,
socket
|> assign(
...,
serve_embedding: serving,
index: index,
db_img: nil,
@@ -3651,7 +3708,6 @@ def handle_knn(index, input) do

case HNSWLib.Index.knn_query(index, input, k: 1) do
{:ok, label, distance} ->
dbg(distance)

label[0]
|> Nx.to_flat_list()
@@ -3665,6 +3721,16 @@ end

```
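
To get a feel for the `distance` value returned by the query, here is a small plain-Elixir sketch, assuming the `:cosine` space reports one minus the cosine similarity, as the underlying hnswlib library documents. `Demo.Cosine` is a hypothetical helper and does not call HNSWLib:

```elixir
defmodule Demo.Cosine do
  # Cosine distance between two equal-length lists of numbers:
  # 1 - (a . b) / (|a| * |b|). Assumes neither vector is all zeros.
  def distance(a, b) when length(a) == length(b) do
    dot =
      a
      |> Enum.zip(b)
      |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)

    1.0 - dot / (norm(a) * norm(b))
  end

  # Euclidean norm of a list of numbers.
  defp norm(v) do
    v
    |> Enum.reduce(0.0, fn x, acc -> acc + x * x end)
    |> :math.sqrt()
  end
end

# Same direction, different lengths.
IO.inspect(Demo.Cosine.distance([1.0, 0.0], [2.0, 0.0]))
# Orthogonal vectors.
IO.inspect(Demo.Cosine.distance([1.0, 0.0], [0.0, 3.0]))
```

Vectors pointing the same way give a distance of 0.0 and orthogonal vectors give 1.0, so a smaller distance means a closer caption.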

Finally, we display the image that was found, since the index lookup gives us back an `App.Image` struct from the database:

```elixir
# /lib/app_web/live/page_live.html.heex

<div :if={@search_result}>
  <img src={@search_result.url} alt="found_image" />
</div>
```

#### Alter schema

To save the index found, we will add a column to the `:images` table. We run a Mix task to generate a timestamped file:
4 changes: 2 additions & 2 deletions assets/js/micro.js
@@ -1,7 +1,7 @@
export default {
mounted() {
let mediaRecorder,
audioChunks = [];
let mediaRecorder;
let audioChunks = [];
const recordButton = document.getElementById("record"),
audioElement = document.getElementById("audio"),
text = document.getElementById("text"),
1 change: 1 addition & 0 deletions lib/app/application.ex
@@ -18,6 +18,7 @@ defmodule App.Application do
{Phoenix.PubSub, name: App.PubSub},
# Nx serving for the embedding
App.TextEmbedding,
App.KnnIndex,
# Nx serving for Speech-to-Text
{Nx.Serving, serving: App.Whisper.serving(), name: Whisper},
# Nx serving for image classifier
40 changes: 40 additions & 0 deletions lib/app/knn_index.ex
@@ -0,0 +1,40 @@
defmodule App.KnnIndex do
  use GenServer

  @indexes "indexes.bin"

  def start_link(_) do
    GenServer.start_link(__MODULE__, {}, name: __MODULE__)
  end

  def init(_) do
    upload_dir = Application.app_dir(:app, ["priv", "static", "uploads"])
    File.mkdir_p!(upload_dir)

    path = Path.join([upload_dir, @indexes])
    space = :cosine

    require Logger

    {:ok, index} =
      case File.exists?(path) do
        false ->
          Logger.info("New Index")
          HNSWLib.Index.new(_space = space, _dim = 384, _max_elements = 200)

        true ->
          Logger.info("Existing Index")
          HNSWLib.Index.load_index(space, 384, path)
      end

    {:ok, index}
  end

  def load_index do
    GenServer.call(__MODULE__, :load)
  end

  def handle_call(:load, _from, state) do
    {:reply, state, state}
  end
end
40 changes: 20 additions & 20 deletions lib/app/text_embedding.ex
@@ -1,38 +1,38 @@
defmodule App.TextEmbedding do
use GenServer
@indexes "indexes.bin"
# @indexes "indexes.bin"

def start_link(_) do
GenServer.start_link(__MODULE__, {}, name: __MODULE__)
end

# upload or create a new index file
def init(_) do
upload_dir = Application.app_dir(:app, ["priv", "static", "uploads"])
File.mkdir_p!(upload_dir)
# upload_dir = Application.app_dir(:app, ["priv", "static", "uploads"])
# File.mkdir_p!(upload_dir)

path = Path.join([upload_dir, @indexes])
space = :cosine
# path = Path.join([upload_dir, @indexes])
# space = :cosine

require Logger
# require Logger

{:ok, index} =
case File.exists?(path) do
false ->
Logger.info("New Index")
HNSWLib.Index.new(_space = space, _dim = 384, _max_elements = 200)
# {:ok, index} =
# case File.exists?(path) do
# false ->
# Logger.info("New Index")
# HNSWLib.Index.new(_space = space, _dim = 384, _max_elements = 200)

true ->
Logger.info("Existing Index")
HNSWLib.Index.load_index(space, 384, path)
end
# true ->
# Logger.info("Existing Index")
# HNSWLib.Index.load_index(space, 384, path)
# end

model_info = nil
tokenizer = nil
{:ok, {model_info, tokenizer, index}, {:continue, :load}}
{:ok, {model_info, tokenizer}, {:continue, :load}}
end

def handle_continue(:load, {_, _, index}) do
def handle_continue(:load, {_, _}) do
transformer = "sentence-transformers/paraphrase-MiniLM-L6-v2"

{:ok, %{model: _model, params: _params} = model_info} =
@@ -43,17 +43,17 @@ defmodule App.TextEmbedding do

require Logger
Logger.info("Transformer loaded")
{:noreply, {model_info, tokenizer, index}}
{:noreply, {model_info, tokenizer}}
end

# called in Liveview `mount`
def serve() do
GenServer.call(__MODULE__, :serve)
end

def handle_call(:serve, _from, {model_info, tokenizer, index} = state) do
def handle_call(:serve, _from, {model_info, tokenizer} = state) do
serving = Bumblebee.Text.TextEmbedding.text_embedding(model_info, tokenizer)

{:reply, {serving, index}, state}
{:reply, serving, state}
end
end
5 changes: 3 additions & 2 deletions lib/app_web/live/page_live.ex
@@ -24,8 +24,9 @@ defmodule AppWeb.PageLive do

@impl true
def mount(_params, _session, socket) do
# File.mkdir_p!(@upload_dir)
{serving, index} = App.TextEmbedding.serve()
# load the embedding and the Index
serving = App.TextEmbedding.serve()
index = App.KnnIndex.load_index()

{:ok,
socket
3 changes: 1 addition & 2 deletions lib/app_web/live/page_live.html.heex
@@ -170,10 +170,9 @@
outline
class="w-6 h-6 text-white font-bold group-active:animate-pulse"
/>
<span>Record</span>
<span id="text">Record</span>
</button>
<audio id="audio" controls></audio>
<%= @speech_spin %>
<AppWeb.Spinner.spin spin={@speech_spin} />
</p>

Binary file modified priv/static/uploads/indexes.bin
Binary file not shown.
