Snowboy hot word Detection #550

Open: wants to merge 7 commits into base: master
34 changes: 33 additions & 1 deletion README.md
@@ -1,4 +1,4 @@
jasper-client
jasper-client - Snowboy Hot Word Detection
=============

[![Build Status](https://travis-ci.org/jasperproject/jasper-client.svg?branch=master)](https://travis-ci.org/jasperproject/jasper-client) [![Coverage Status](https://img.shields.io/coveralls/jasperproject/jasper-client.svg)](https://coveralls.io/r/jasperproject/jasper-client) [![Codacy Badge](https://www.codacy.com/project/badge/3a50e1bc2261419894d76b7e2c1ac694)](https://www.codacy.com/app/jasperproject/jasper-client)
@@ -7,6 +7,38 @@ Client code for the Jasper voice computing platform. Jasper is an open source pl

Learn more at [jasperproject.github.io](http://jasperproject.github.io/), where we have assembly and installation instructions, as well as extensive documentation. For the relevant disk image, please visit [SourceForge](http://sourceforge.net/projects/jasperproject/).

## Differences from the main project

The problem with online STT engines is the serious lack of privacy.

In Jasper, setting `stt_passive_engine` in profile.yml lets you choose the STT engine that listens until you say "Jasper".
If you care about your privacy, you of course have to choose an offline STT engine.

That's good, but:
- You can't easily choose a wake-up word.
- When I say "Jasper", it doesn't work every time (due to my accent, I think).

To address these points, I decided to use "Snowboy Hot Word Detection" [https://snowboy.kitt.ai/](https://snowboy.kitt.ai/) as the passive STT engine. It offers:
- The ability to define and train your own hotword
- High accuracy
- Low latency, with no internet connection needed
- A small memory footprint and cross-platform support
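
As a rough illustration of how a hotword detector of this kind is driven, here is a minimal sketch that runs without Snowboy installed. `FakeDetector` and `run_demo` are hypothetical names invented for this example and are not part of Snowboy; only the keyword arguments `detected_callback`, `interrupt_check` and `sleep_time` are taken from the `detector.start(...)` call in this change.

```python
import time


class FakeDetector(object):
    """Hypothetical stand-in for a hotword detector; not part of Snowboy."""

    def __init__(self, events):
        # events: iterable of booleans, True meaning "hotword heard"
        self._events = iter(events)

    def start(self, detected_callback, interrupt_check, sleep_time=0.03):
        # Poll until the caller asks to stop or the scripted events run out.
        for heard in self._events:
            if interrupt_check():
                break
            if heard:
                detected_callback()
            time.sleep(sleep_time)


def run_demo():
    hits = []
    state = {"interrupted": False}

    def on_hotword():
        hits.append("hotword")
        # Request a stop after the second detection, as a Ctrl+C handler would.
        if len(hits) == 2:
            state["interrupted"] = True

    detector = FakeDetector([False, True, False, True, True])
    detector.start(detected_callback=on_hotword,
                   interrupt_check=lambda: state["interrupted"],
                   sleep_time=0)
    return hits
```

The point of the design is that the detector owns the listening loop, while the application only supplies a callback to run on detection and a flag check that lets it stop the loop cleanly.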

## Installation

- Follow the Python installation instructions at [https://github.com/Kitt-AI/snowboy](https://github.com/Kitt-AI/snowboy)
- Copy `_snowboydetect.so` (generated in the Snowboy project under snowboy/swig/Python) into jasper/client/snowboy
- Create your model at [https://snowboy.kitt.ai/](https://snowboy.kitt.ai/)
- Copy your model into /jasper/client/snowboy under the name "model.pmdl"
- And here we go!
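
A quick way to confirm that the copied files landed where expected might look like the sketch below. The helper name `missing_snowboy_files` is hypothetical and not part of Jasper or Snowboy; the two file names are the ones listed in the steps above, and the directory to check is whatever path your checkout uses for `client/snowboy`.

```python
import os


def missing_snowboy_files(snowboy_dir):
    """Return the required files that are absent from snowboy_dir."""
    # File names taken from the installation steps; adjust if yours differ.
    required = ["_snowboydetect.so", "model.pmdl"]
    return [name for name in required
            if not os.path.isfile(os.path.join(snowboy_dir, name))]
```

Calling `missing_snowboy_files("client/snowboy")` from the repository root should return an empty list once both files are in place.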

## TODO

- Remove "Persona" from the code
- Remove all remaining dependencies on stt_passive_engine in the code
- Reduce the latency in active listening (currently too high)
- Re-add the Gmail notification

## Contributing

If you'd like to contribute to Jasper, please read through our **[Contributing Guide](CONTRIBUTING.md)**, which outlines the philosophies to preserve, tests to run, and more. We highly recommend reading through this guide before writing any code.
69 changes: 38 additions & 31 deletions client/conversation.py
@@ -1,49 +1,56 @@
# -*- coding: utf-8-*-
import logging
import signal
import os
from notifier import Notifier
from brain import Brain

from snowboy import snowboydecoder

class Conversation(object):

    def __init__(self, persona, mic, profile):
    def __init__(self, mic, profile):
        self._logger = logging.getLogger(__name__)
        self.persona = persona
        self.mic = mic
        self.profile = profile
        self.brain = Brain(mic, profile)
        self.notifier = Notifier(profile)
        self.interrupted = False

    def signal_handler(self, signal, frame):
        self.interrupted = True

    def interrupt_callback(self):
        return self.interrupted

    def startListenningActively(self):
        threshold = None
        self._logger.debug("Started to listen actively with threshold: %r",
                           threshold)
        input = self.mic.activeListenToAllOptions(threshold)
        self._logger.debug("Stopped to listen actively with threshold: %r",
                           threshold)
        print("i'm here now")
        if input:
            self.brain.query(input)
        else:
            self.mic.say("Pardon?")

    def handleForever(self):
        """
        Delegates user input to the handling function when activated.
        """
        self._logger.info("Starting to handle conversation with keyword '%s'.",
                          self.persona)
        while True:
            # Print notifications until empty
            notifications = self.notifier.getAllNotifications()
            for notif in notifications:
                self._logger.info("Received notification: '%s'", str(notif))

            self._logger.debug("Started listening for keyword '%s'",
                               self.persona)
            threshold, transcribed = self.mic.passiveListen(self.persona)
            self._logger.debug("Stopped listening for keyword '%s'",
                               self.persona)

            if not transcribed or not threshold:
                self._logger.info("Nothing has been said or transcribed.")
                continue
            self._logger.info("Keyword '%s' has been said!", self.persona)

            self._logger.debug("Started to listen actively with threshold: %r",
                               threshold)
            input = self.mic.activeListenToAllOptions(threshold)
            self._logger.debug("Stopped to listen actively with threshold: %r",
                               threshold)
        self._logger.info("Starting to handle conversation")

        TOP_DIR = os.path.dirname(os.path.abspath(__file__))
        MODEL_FILE = os.path.join(TOP_DIR, "snowboy/model.pmdl")

        signal.signal(signal.SIGINT, self.signal_handler)
        detector = snowboydecoder.HotwordDetector(MODEL_FILE, sensitivity=0.5)
        print('Listening... Press Ctrl+C to exit')

        # main loop
        detector.start(detected_callback=self.startListenningActively,
                       interrupt_check=self.interrupt_callback,
                       sleep_time=0.03)

            if input:
                self.brain.query(input)
            else:
                self.mic.say("Pardon?")
        detector.terminate()
102 changes: 1 addition & 101 deletions client/mic.py
@@ -16,19 +16,16 @@ class Mic:
    speechRec = None
    speechRec_persona = None

    def __init__(self, speaker, passive_stt_engine, active_stt_engine):
    def __init__(self, speaker, active_stt_engine):
        """
        Initiates the pocketsphinx instance.

        Arguments:
        speaker -- handles platform-independent audio output
        passive_stt_engine -- performs STT while Jasper is in passive listen
                              mode
        acive_stt_engine -- performs STT while Jasper is in active listen mode
        """
        self._logger = logging.getLogger(__name__)
        self.speaker = speaker
        self.passive_stt_engine = passive_stt_engine
        self.active_stt_engine = active_stt_engine
        self._logger.info("Initializing PyAudio. ALSA/Jack error messages " +
                          "that pop up during this process are normal and " +
@@ -86,103 +83,6 @@ def fetchThreshold(self):

        return THRESHOLD

    def passiveListen(self, PERSONA):
        """
        Listens for PERSONA in everyday sound. Times out after LISTEN_TIME, so
        needs to be restarted.
        """

        THRESHOLD_MULTIPLIER = 1.8
        RATE = 16000
        CHUNK = 1024

        # number of seconds to allow to establish threshold
        THRESHOLD_TIME = 1

        # number of seconds to listen before forcing restart
        LISTEN_TIME = 10

        # prepare recording stream
        stream = self._audio.open(format=pyaudio.paInt16,
                                  channels=1,
                                  rate=RATE,
                                  input=True,
                                  frames_per_buffer=CHUNK)

        # stores the audio data
        frames = []

        # stores the lastN score values
        lastN = [i for i in range(30)]

        # calculate the long run average, and thereby the proper threshold
        for i in range(0, RATE / CHUNK * THRESHOLD_TIME):

            data = stream.read(CHUNK)
            frames.append(data)

            # save this data point as a score
            lastN.pop(0)
            lastN.append(self.getScore(data))
            average = sum(lastN) / len(lastN)

        # this will be the benchmark to cause a disturbance over!
        THRESHOLD = average * THRESHOLD_MULTIPLIER

        # save some memory for sound data
        frames = []

        # flag raised when sound disturbance detected
        didDetect = False

        # start passively listening for disturbance above threshold
        for i in range(0, RATE / CHUNK * LISTEN_TIME):

            data = stream.read(CHUNK)
            frames.append(data)
            score = self.getScore(data)

            if score > THRESHOLD:
                didDetect = True
                break

        # no use continuing if no flag raised
        if not didDetect:
            print "No disturbance detected"
            stream.stop_stream()
            stream.close()
            return (None, None)

        # cutoff any recording before this disturbance was detected
        frames = frames[-20:]

        # otherwise, let's keep recording for few seconds and save the file
        DELAY_MULTIPLIER = 1
        for i in range(0, RATE / CHUNK * DELAY_MULTIPLIER):

            data = stream.read(CHUNK)
            frames.append(data)

        # save the audio data
        stream.stop_stream()
        stream.close()

        with tempfile.NamedTemporaryFile(mode='w+b') as f:
            wav_fp = wave.open(f, 'wb')
            wav_fp.setnchannels(1)
            wav_fp.setsampwidth(pyaudio.get_sample_size(pyaudio.paInt16))
            wav_fp.setframerate(RATE)
            wav_fp.writeframes(''.join(frames))
            wav_fp.close()
            f.seek(0)
            # check if PERSONA was said
            transcribed = self.passive_stt_engine.transcribe(f)

        if any(PERSONA in phrase for phrase in transcribed):
            return (THRESHOLD, PERSONA)

        return (False, transcribed)

    def activeListen(self, THRESHOLD=None, LISTEN=True, MUSIC=False):
        """
        Records until a second of silence or times out after 12 seconds
1 change: 0 additions & 1 deletion client/modules/MPDControl.py
@@ -78,7 +78,6 @@ def __init__(self, PERSONA, mic, mpdwrapper):
        music_stt_engine = mic.active_stt_engine.get_instance('music', phrases)

        self.mic = Mic(mic.speaker,
                       mic.passive_stt_engine,
                       music_stt_engine)

    def delegateInput(self, input):
Empty file added client/snowboy/__init__.py
Empty file.
35 changes: 35 additions & 0 deletions client/snowboy/demo.py
@@ -0,0 +1,35 @@
import snowboydecoder
import sys
import signal

interrupted = False


def signal_handler(signal, frame):
    global interrupted
    interrupted = True


def interrupt_callback():
    global interrupted
    return interrupted

if len(sys.argv) == 1:
    print("Error: need to specify model name")
    print("Usage: python demo.py your.model")
    sys.exit(-1)

model = sys.argv[1]

# capture SIGINT signal, e.g., Ctrl+C
signal.signal(signal.SIGINT, signal_handler)

detector = snowboydecoder.HotwordDetector(model, sensitivity=0.5)
print('Listening... Press Ctrl+C to exit')

# main loop
detector.start(detected_callback=snowboydecoder.play_audio_file,
               interrupt_check=interrupt_callback,
               sleep_time=0.03)

detector.terminate()
Binary file added client/snowboy/resources/common.res
Binary file not shown.
Binary file added client/snowboy/resources/ding.wav
Binary file not shown.
Binary file added client/snowboy/resources/dong.wav
Binary file not shown.