migrate mww to esphome_audio, bring back volume control and media_player (#39)

* WIP - migrate to esphome_audio so we can make better use of the audio chips on board and have media_player functionality

* we do need to wait for media_player, as on_end works differently than the needed on_tts_stream_end when we use speaker

* bring back volume control.

* addressing feedback w/r/t naming conventions and whitespace

* if media player is playing, stop it when wake word is detected

* save/restore volume between reboots

* Revert "save/restore volume between reboots"

This reverts commit 76a103f.

* Make it easier to resume playback after VA activity

* README updates

---------

Co-authored-by: Tudor Sandu <[email protected]>
cowboyrushforth and tetele authored May 9, 2024
1 parent 45c9b4d commit e8f276f
Showing 2 changed files with 161 additions and 29 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -18,10 +18,10 @@ The config is distributed under the **MIT** License. See [`LICENSE`](LICENSE) fo

- wake word (including [microWakeWord](https://www.esphome.io/components/micro_wake_word) in an experimental phase), push to talk, on-demand and continuous conversation support
- response playback
- audio media player (not supported by microWakeWord config yet)
- audio media player
- service exposed in HA to start and stop the voice assistant from another device/trigger
- visual feedback of the wake word listening/audio recording/success/error status via the Mini's onboard top LEDs
- uses all 3 of the original Mini's touch controls as volume controls and a means of manually starting the assistant and setting the volume (volume supported only for non-microWakeWord variant)
- uses all 3 of the original Mini's touch controls as volume controls and a means of manually starting the assistant and setting the volume
- uses the original Mini's microphone mute button to prevent the wake word engine from starting unintentionally
- automatic continuous touch control calibration

@@ -35,7 +35,7 @@ The config is distributed under the **MIT** License. See [`LICENSE`](LICENSE) fo

- you have to be able to retrofit an Onju Voice PCB inside a 2nd generation Google Nest Mini.
- ESPHome currently can't use the I2S bus for both listening and playing **simultaneously**. As such, if you want to stream audio (like a TTS notification) to the Onju, you **need** to stop wake word listening first
- the version for `microWakeWord` is in BETA and probably full of bugs
- the version for `microWakeWord` is in BETA and probably full of bugs (please [report them](https://github.com/tetele/onju-voice-satellite/issues/new?assignees=&labels=bug&projects=&template=bug.yml) if you find any)

## Installation instructions

@@ -53,10 +53,12 @@ After the device has been added to ESPHome, if auto discovery is turned on, the

- obviously, a huge thanks to [Justin Alvey](https://twitter.com/justLV) (@justLV) for the excellent Onju Voice project
- many thanks to Mike Hansen ([@synesthesiam](https://github.com/synesthesiam)) for the relentless work he's put into [Year of the Voice](https://www.home-assistant.io/voice_control/) at Home Assistant
- thanks to [@kahrendt](https://github.com/kahrendt) for [microWakeWord](https://github.com/kahrendt/microWakeWord)
- thanks to [@gnumpi](https://github.com/gnumpi) for migrating the ESPHome [`media_player` component to ESP-IDF](https://github.com/gnumpi/esphome_audio)
- thanks to [Klaas Schoute](https://github.com/klaasnicolaas) for helping with creating a microsite for the automatic installation of this config (still experimental)
- thanks to the [ESPHome Discord server](https://discord.gg/KhAMKrd) members for both creating the most time saving piece of software ever and for helping out with some kinks with the config - in particular @jesserockz, @ssieb, @Hawwa, @BigBobba

[![GithubSponsor][githubsponsorbadge]][githubsponsor]
If you'd like to thank me for creating and maintaining this config, you can [![GithubSponsor][githubsponsorbadge]][githubsponsor]

[githubsponsor]: https://github.com/sponsors/tetele/
[githubsponsorbadge]: https://img.shields.io/badge/sponsor%20me%20on%20github-sponsor-yellow.svg?style=for-the-badge
180 changes: 155 additions & 25 deletions esphome/onju-voice-microwakeword.yaml
@@ -3,6 +3,12 @@ substitutions:
friendly_name: "Onju Voice Satellite"
project_version: "1.0.0"
device_description: "Onju Voice Satellite with ESPHome software and microWakeWord"
external_components:
- source:
type: git
url: https://github.com/gnumpi/esphome_audio
ref: main
components: [ adf_pipeline, i2s_audio ]

esphome:
name: "${name}"
@@ -52,6 +58,11 @@ esp32:
board: esp32-s3-devkitc-1
framework:
type: esp-idf
version: recommended
sdkconfig_options:
# need to set a s3 compatible board for the adf-sdk to compile
# board specific code is not used though
CONFIG_ESP32_S3_BOX_BOARD: "y"

psram:
mode: octal
@@ -121,35 +132,84 @@ interval:
id: calibrate_touch
button: 2


i2s_audio:
- i2s_lrclk_pin: GPIO13
- id: i2s_shared
i2s_lrclk_pin: GPIO13
i2s_bclk_pin: GPIO18
access_mode: duplex

micro_wake_word:
model: okay_nabu
# model: hey_jarvis
# model: alexa
on_wake_word_detected:
- voice_assistant.start:
wake_word: !lambda return wake_word;

speaker:
adf_pipeline:
- platform: i2s_audio
id: onju_out
dac_type: external
type: audio_out
id: adf_i2s_out
i2s_audio_id: i2s_shared
i2s_dout_pin: GPIO12
sample_rate: 16000
adf_alc: true
bits_per_sample: 32bit
fixed_settings: true

microphone:
- platform: i2s_audio
id: onju_microphone
type: audio_in
id: adf_i2s_in
i2s_audio_id: i2s_shared
i2s_din_pin: GPIO17
adc_type: external
channel: right
pdm: false
sample_rate: 16000
bits_per_sample: 32bit
fixed_settings: true


microphone:
- platform: adf_pipeline
id: onju_microphone
keep_pipeline_alive: true
gain_log2: 3
pipeline:
- adf_i2s_in
- self

media_player:
- platform: adf_pipeline
id: onju_out
name: None
internal: false
keep_pipeline_alive: true
pipeline:
- self
- resampler
- adf_i2s_out
on_state:
then:
- lambda: |-
static float old_volume = -1;
float new_volume = id(onju_out).volume;
if(abs(new_volume-old_volume) > 0.0001) {
if(old_volume != -1) {
id(show_volume)->execute();
}
}
old_volume = new_volume;
micro_wake_word:
model: okay_nabu
# model: hey_jarvis
# model: alexa
on_wake_word_detected:
- if:
condition: media_player.is_playing
then:
- media_player.pause
- voice_assistant.start:
wake_word: !lambda return wake_word;

voice_assistant:
id: va
microphone: onju_microphone
speaker: onju_out
media_player: onju_out
use_wake_word: false
on_listening:
- light.turn_on:
@@ -174,7 +234,11 @@ voice_assistant:
red: 20%
green: 100%
effect: speaking
on_tts_stream_end:
on_end:
- delay: 200ms
- wait_until:
not:
media_player.is_playing: onju_out
- script.execute: reset_led
- if:
condition:
@@ -248,11 +312,47 @@ binary_sensor:
id: volume_down
pin: GPIO4
threshold: 539000
on_press:
then:
- light.turn_on: left_led
- script.execute:
id: set_volume
volume: -0.05
- delay: 750ms
- while:
condition:
binary_sensor.is_on: volume_down
then:
- script.execute:
id: set_volume
volume: -0.05
- delay: 150ms
on_release:
then:
- light.turn_off: left_led

- platform: esp32_touch
id: volume_up
pin: GPIO2
threshold: 580000
on_press:
then:
- light.turn_on: right_led
- script.execute:
id: set_volume
volume: 0.05
- delay: 750ms
- while:
condition:
binary_sensor.is_on: volume_up
then:
- script.execute:
id: set_volume
volume: 0.05
- delay: 150ms
on_release:
then:
- light.turn_off: right_led

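The two touch pads above implement a press-and-hold auto-repeat: one ±0.05 step fires immediately on press, then after a 750 ms initial delay the step repeats every 150 ms for as long as the pad is held. A minimal Python sketch of that timing (function name hypothetical, just to check the step count):

```python
def volume_steps(hold_ms, initial_delay=750, repeat_every=150):
    """Number of volume steps fired for a touch held `hold_ms` milliseconds,
    mirroring the on_press/while/delay structure in the config."""
    steps = 1  # one step fires immediately on press
    remaining = hold_ms - initial_delay
    while remaining >= 0:  # the while loop re-fires every repeat interval
        steps += 1
        remaining -= repeat_every
    return steps
```

A short tap therefore changes the volume by exactly one step, while holding for about a second yields several.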
- platform: esp32_touch
id: action
@@ -270,9 +370,9 @@ binary_sensor:
format: "Voice assistant is running: %s"
args: ['id(va).is_running() ? "yes" : "no"']
- if:
condition: speaker.is_playing
condition: media_player.is_playing
then:
- speaker.stop
- media_player.stop
- if:
condition: voice_assistant.is_running
then:
@@ -284,9 +384,9 @@ binary_sensor:
tag: "action_click"
format: "Voice assistant was running with wake word detection enabled. Starting continuously"
- if:
condition: speaker.is_playing
condition: media_player.is_playing
then:
- speaker.stop
- media_player.stop
- voice_assistant.stop
- delay: 1s
- script.execute: reset_led
@@ -337,6 +437,23 @@ light:
name: slow_pulse
transition_length: 1s
update_interval: 2s
- addressable_lambda:
name: show_volume
update_interval: 50ms
lambda: |-
int int_volume = int(id(onju_out).volume * 100.0f * it.size());
int full_leds = int_volume / 100;
int last_brightness = int_volume % 100;
int i = 0;
for(; i < full_leds; i++) {
it[i] = Color::WHITE;
}
if(i < 4) {
it[i++] = Color(64, 64, 64).fade_to_white(last_brightness*256/100);
}
for(; i < it.size(); i++) {
it[i] = Color(64, 64, 64);
}
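The `show_volume` lambda above renders the volume as a bar across the top LEDs: some LEDs fully white, one LED partially faded, and the rest dim. The integer math can be checked in isolation with a Python sketch (function name hypothetical) that mirrors the lambda's `int_volume`/`full_leds`/`last_brightness` computation:

```python
def led_bar(volume, leds=4):
    """Map a 0.0-1.0 volume to (fully lit LEDs, partial-LED brightness percent),
    as in the show_volume addressable_lambda."""
    int_volume = int(volume * 100.0 * leds)
    return int_volume // 100, int_volume % 100
```

For example, 50% volume lights two of the four LEDs fully, with no partial LED.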
- addressable_twinkle:
name: listening_ww
twinkle_probability: 1%
@@ -398,6 +515,24 @@ script:
- lambda: id(notification) = false;
- script.execute: reset_led

- id: set_volume
mode: restart
parameters:
volume: float
then:
- media_player.volume_set:
id: onju_out
volume: !lambda return clamp(id(onju_out).volume+volume, 0.0f, 1.0f);
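The `set_volume` script above applies a relative step and clamps the result to the valid range, so repeated presses past the limits are harmless. The equivalent logic as a Python sketch (function name hypothetical):

```python
def set_volume(current, step):
    """Apply a relative volume step, clamped to the 0.0-1.0 range,
    like the clamp() lambda in the set_volume script."""
    return min(max(current + step, 0.0), 1.0)
```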

- id: show_volume
mode: restart
then:
- light.turn_on:
id: top_led
effect: show_volume
- delay: 1s
- script.execute: reset_led

- id: turn_on_wake_word
then:
- if:
@@ -407,11 +542,6 @@ script:
- switch.is_on: use_wake_word
then:
- micro_wake_word.start
- if:
condition:
speaker.is_playing:
then:
- speaker.stop:
- script.execute: reset_led
else:
- logger.log:
