High RAM usage and unexpected restart of Home Assistant #1241

Open
pico1881 opened this issue Jul 3, 2024 · 32 comments
Assignees
Labels
bug Something isn't working

Comments

@pico1881

pico1881 commented Jul 3, 2024

For a few days I have been seeing unexpected restarts of Home Assistant. After some analysis I found that the cause was excessive RAM usage, above 90%. Further analysis showed that the RAM was being used by go2rtc (see screenshot), but I don't see any errors in the log.
After manually restarting the go2rtc add-on, RAM usage is about 2-5%.
[Screenshot: go2rtc RAM usage]

@porly1985

I can confirm this behaviour. After restarting go2rtc the RAM usage is fine again.

image

@dagleaves

dagleaves commented Jul 10, 2024

I'm not Alex, but I've been investigating similar issues

  1. What version of go2rtc are you on?
  2. Are you using audio on any of your cameras via a secondary audio ffmpeg stream?
  3. If 2 is yes, do you notice a large number of consumers on your stream when your memory usage is high?
  4. What brand of cameras are you using?

I would also double check that your log level is set to debug or trace if you aren't seeing errors. If you are on info, most errors won't get reported.
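
For anyone following along, here is a minimal sketch of raising the log level in go2rtc.yaml. The `log` section and its `level` key follow the go2rtc README; the per-module override is optional, and choosing `streams` here is only an illustration.

```yaml
log:
  level: debug    # the default is info; trace is even more verbose
  streams: trace  # optional per-module override (illustrative choice)
```

go2rtc needs a restart after the edit for the new level to take effect.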

@alexandreghez

Same problem:
I'm using go2rtc 1.9.4 on a Raspberry Pi 4 with 3 cameras: 2 HomeKit (Aqara G2H Pro and Eufy) and 1 Tapo C220.
image

@dagleaves

> Same problem:
>
> I'm using go2rtc 1.9.4 on a Raspberry Pi 4 with 3 cameras: 2 HomeKit (Aqara G2H Pro and Eufy) and 1 Tapo C220.

Are you using an audio stream with it? I would encourage you to check the things I listed above to help identify the source of the issue.

@porly1985

porly1985 commented Jul 11, 2024

> I'm not Alex, but I've been investigating similar issues
>
>   1. What version of go2rtc are you on?
>   2. Are you using audio on any of your cameras via a secondary audio ffmpeg stream?
>   3. If 2 is yes, do you notice a large number of consumers on your stream when your memory usage is high?
>   4. What brand of cameras are you using?
>
> I would also double check that your log level is set to debug or trace if you aren't seeing errors. If you are on info, most errors won't get reported.

Hi dagleaves,

  1. 1.9.4 (WebRTC Camera + HA Addon)
  2. Yes, all the audio goes through ffmpeg.
  3. I will check it when it happens again; the high RAM usage only occurs from time to time in my case.
  4. 3 x Reolink + 1 x Tapo C210

Thanks and greetings!

Update:

Just happened again.

image

This network is active all the time, even when cameras are not in use.

Here is my camera config:

  camera.garten_main:
    - http://192.168.178.135/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=XXXXXX&password=XXXXXXX#video=copy#audio=copy#audio=opus
    - ffmpeg:camera.garten_main#audio=opus

@AlexxIT AlexxIT added the bug Something isn't working label Jul 13, 2024
@AlexxIT AlexxIT self-assigned this Jul 13, 2024
@AlexxIT
Owner

AlexxIT commented Jul 13, 2024

Memory graph screenshots don't help much in understanding and locating the problem.
The config is more useful, though I don't know whether it will be enough to reproduce the problem.

@pico1881
Author

pico1881 commented Jul 14, 2024

In my case I solved it by removing this entry from the config:
"camera.camera1_sub": ffmpeg:{input}#video=copy#audio=opus#audio=aac

EDIT: The problem persists and still causes HA to reboot due to high RAM usage. For now I have uninstalled the go2rtc add-on and I'm trying to use only the WebRTC Camera custom component.

@alexandreghez

This is the streams part of my configuration:

streams:
  eufy:
    - homekit://[IP]:46778?device_id=MACADDR&device_public=[PUBLIC]&client_id=[CLIENT_ID]&client_private=[CLIENT_PRIVATE]
    - ffmpeg:eufy#audio=aac#audio=opus
  aqara: 
    - homekit://[IP]:37833?device_id=MACADDR&device_public=[PUBLIC]&client_id=[CLIENT_ID]&client_private=[CLIENT_PRIVATE]
    - ffmpeg:aqara#audio=aac#audio=opus#bitrate=299K
  tapo: 
    - tapo://admin:[PASSWORD]@[IP]
    - ffmpeg:tapo#audio=aac

@1andrevich

go2rtc version=1.9.4 platform=linux/amd64 revision=a4885c2
The problem occurs at irregular intervals, as if something triggers it. An HA reboot helped last time, but it's hard to determine what the trigger is. I have 1 doorbell with 2-way audio, 2 Reolink cameras on Wi-Fi, and 8 channels from a Dahua NVR, all shown in WebRTC cards.

go2rtc.yaml:

api:
rtsp:
webrtc:
  listen: ":8555"
  candidates:
streams:
  reodoor_sub:
    - "ffmpeg:http://IP/flv?port=1935&app=bcs&stream=channel0_sub.bcs&user=admin&password=PASSWORD#video=copy#audio=copy#audio=opus"
    - rtsp://admin:PASSWORD@IP:554/h264Preview_01_sub
    - ffmpeg:dahua1#video=h264#hardware  # if your camera doesn't support H264, important for HomeKit
    - ffmpeg:dahua1#audio=opus           # only OPUS audio supported by HomeKit
  reodoor_main:
    - "ffmpeg:http://IP/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=admin&password=PASSWORD#video=copy#audio=copy#audio=opus"
    - rtsp://admin:PASSWORD@IP:554/h264Preview_01_main

#homekit:
#reodoor_sub:
#pin: 19550224         # custom PIN, default: 19550224
#name: Reolink Doorbell      # custom camera name, default: generated from stream ID

go2rtc_memory_consumption
go2rtc_scheme

@pico1881
Author

My current "solution" is an automation that restarts the go2rtc add-on when it uses more than 20% of RAM.

@Uriziel01

Uriziel01 commented Oct 17, 2024

@pico1881 Could you share that automation, please? I just discovered that I'm also affected by this issue; my HA server is restarting every few days/weeks. Here is my automation, if anyone is wondering:

alias: Restart Frigate/go2rtc when memory leak is detected
description: ""
triggers:
  - trigger: numeric_state
    entity_id:
      - sensor.memory_use_percent #<- adjust this to your setup
    for:
      hours: 0
      minutes: 0
      seconds: 1
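    attribute: value #<- compares the "value" attribute rather than the sensor state; remove this line if your sensor has no such attribute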
    above: 60 #<- adjust this to your setup
conditions: []
actions:
  - action: hassio.addon_restart
    data:
      addon: ccab4aaf_frigate
mode: single

@pico1881
Author

This is my automation:

alias: Restart go2rtc
description: ""
triggers:
  - trigger: numeric_state
    entity_id:
      - sensor.go2rtc_memory_use_percent
    for:
      hours: 0
      minutes: 0
      seconds: 30
    above: 20
conditions: []
actions:
  - action: hassio.addon_restart
    metadata: {}
    data:
      addon: a889bffc_go2rtc
mode: single

@SynaZe

SynaZe commented Oct 31, 2024

> For a few days I have been seeing unexpected restarts of Home Assistant. After some analysis I found that the cause was excessive RAM usage, above 90%. Further analysis showed that the RAM was being used by go2rtc (see screenshot), but I don't see any errors in the log.
> After manually restarting the go2rtc add-on, RAM usage is about 2-5%.
> [Screenshot: go2rtc RAM usage]

I just started with Home Assistant, and I suspect I also have an issue with go2rtc and memory usage. How do you create this specific sensor to track the memory usage of go2rtc?

@pico1881
Author

A memory usage sensor already exists for every add-on, but it is disabled by default. Search for "go2rtc memory use percent" among the disabled entities and enable it.

@SynaZe

SynaZe commented Oct 31, 2024

> A memory usage sensor already exists for every add-on, but it is disabled by default. Search for "go2rtc memory use percent" among the disabled entities and enable it.

Sorry for hijacking your post. If I search the entities, I can't find the sensor you mention.
image
Is it because I'm missing some specific integration/add-on?

Update: I figured it out. I had the WebRTC Camera integration, but I didn't have the go2rtc add-on.

@pico1881
Author

The memory sensor is available with the go2rtc add-on, not with the custom component.

@coolguy45

coolguy45 commented Nov 20, 2024

Any updates on a fix? I am having the same issue. Using HAOS, and the Frigate Full access add-on (version 0.14.1) that has go2rtc included (go2rtc version 1.9.2). System runs out of memory very quickly after enabling Frigate add-on (I have to keep it disabled for now so the system won't crash). I'm using go2rtc to restream a single RTSP camera to Frigate.
frigate

@AlexxIT
Owner

AlexxIT commented Nov 20, 2024

The issue depends on the environment: possibly on the cameras, their models and firmware, and possibly on problems at the network level.

I can't reproduce anything like this on my two dozen test streams.

@dankarization

So what can we do to help fix that?

@AlexxIT
Owner

AlexxIT commented Nov 20, 2024

The first step to fixing any problem is to reproduce it. We need to understand exactly what is causing the problem. Ideally, run go2rtc on a separate home PC with only one stream of a certain type, and make sure that it is this stream that is causing the problem.

I've heard a similar claim about an RTMP stream from a Reolink camera. I will try to replicate it on my camera.

But I see above that this occurs with the RTSP stream as well.
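
One way to set up the isolated test described above is a go2rtc.yaml with a single stream and verbose logging. This is only a sketch: the stream name `test_cam` is hypothetical, `PASSWORD` and `IP` are placeholders, and the RTSP path is copied from the Reolink config posted earlier in this thread.

```yaml
log:
  level: debug  # makes errors visible that the info level may hide

streams:
  test_cam:     # hypothetical stream name
    - rtsp://admin:PASSWORD@IP:554/h264Preview_01_main
```

If memory stays flat with this single source, adding the other sources back one at a time (for example an `ffmpeg:` audio transcoding line) should help narrow down which one triggers the leak.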

@TekFan

TekFan commented Nov 21, 2024

So, I'm in the same situation...

These are my observations since the upgrade to 2024.11:

  • context: Before upgrade, I was using the WebRTC native integration and the Go2RTC server add-on and everything was working perfectly.
    I'm using 4 old Axis cameras via Axis integration and 4 Reolink cameras via Reolink integration and I'm on HA-OS running on a RPi5.

  • Issue: After upgrading to 2024.11, I obviously stopped the Go2RTC add-on and disabled the previous WebRTC integration.
    At first, the new core Go2RTC platform seemed to work as advertised but it took only a few hours before HA crashed.
    After a second crash, I noticed that this was due to an explosive memory leak.
    Because I also observed a camera picture failure just before both crashes (heavy blocking and green pictures), I immediately investigated the new Go2RTC implementation.
    So what I did was restart the Go2RTC server add-on (running on the same RPi; remember, it had been stopped), point the Go2RTC integration to this external server via configuration.yaml, and restart HA.
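
Pointing the core integration at an external server, as described above, can be sketched in configuration.yaml as follows. The `go2rtc` key with a `url` option follows the Home Assistant go2rtc integration documentation; the address is an assumption (an add-on on the same host, listening on go2rtc's default API port 1984) and must be adjusted to the actual setup.

```yaml
# configuration.yaml (Home Assistant)
go2rtc:
  url: http://localhost:1984  # assumed address of the external go2rtc server
```

Home Assistant has to be restarted for the change to apply.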

Since then, HA has been stable with no memory leak, but the leak has moved to the Go2RTC server add-on, which is at least more manageable as I simply restart the add-on constantly via an automation.

So, first clue: the memory leak is definitely on the server side.

Now, what I could already observe is that this memory leak seems to occur at the moment another issue appears, namely a stream deadlock.
During all these investigations I saw that I was rapidly losing the WebRTC stream and all my cameras were reverting, one after the other, to standard HLS streaming.
When this happens, I see there is a rogue stream left open in the Go2RTC server UI.
This stream still seems to be connected to the camera but has no output, and the amount of data transferred from the camera continues to increase. I don't know whether this data is going somewhere or is simply discarded, but it could explain the leak if the data keeps accumulating in a buffer.

Here is a screenshot of the net flow for this unclosed stream:
[Screenshot 2024-11-17 at 11 53 12]

A consequence of this unclosed stream remaining there indefinitely is that the camera is blocked: no new stream can be created, and even in the Go2RTC console UI, trying to load the camera ends in a black screen with perpetual "loading".
If I manually delete this stream, normal operation resumes for this camera until the next occurrence of the issue.
This explains why my cameras sooner or later revert to HLS, as the integration cannot establish a WebRTC stream anymore.

So, I think that the fact that a stream can be left open without any output is not something that should be expected and so should be considered as a bug.
That said, I'm obviously not sure there actually is any relation between this bug and the explosive memory leak, it may be a coincidence.

And to conclude, I think the 2024.11 HA release has just integrated the Go2RTC add-on code into core, right?
So this bug has been integrated as well, which would explain why many users are now suddenly experiencing a memory leak and crashes in HA itself.

Sorry, was a bit long but I hope this helps.

@AlexxIT
Owner

AlexxIT commented Nov 21, 2024

@TekFan Thanks. Very helpful. I need to reproduce the situation where the stream doesn't close. This is definitely an issue; it shouldn't happen.

@TekFan

TekFan commented Nov 22, 2024

No worries.

Btw, I feel your pain: I tried to reproduce this issue on my staging system and couldn't until yesterday, when it happened for the first time in 2 weeks.
And the phenomenon is the same: as soon as a stream is stuck unclosed, the memory leak starts, and in this case it reached 2.5 GB.
This HA instance is super light, so completely different from my prod system.
The only things in common are that it also runs on a Pi 5 and, obviously, the cameras. One notable difference is that on this staging instance I run the internal Go2RTC integration AND server, not the add-on, so as I said previously, this problem happens whichever solution is used.

Now, I could get the info for a stuck stream from the debug console, and I'm a bit puzzled by the "Childs" list, which is significantly longer than for a normal stream.
Go2RTC_info.txt

@dagleaves

What stands out to me from that:

  1. You aren't sending video. You have no real consumers, just the audio transcoding one. Maybe the real consumers left, so it tried to tear down the stream, hangs on trying to close non-existent audio child? (See 2)
  2. Your audio sender looks to be the original audio thread (25 -> 26). However, looking at the IDs of the audio producer's children, that child is believed to be removed and replaced by the parent producer (25 -> 75). 26 is likely hanging? Or still tracked somewhere it should've been removed?

This seems to be the same issue as the Reolink audio issue #1254, possibly fixed by #1431.

I can test #1431 at one of my Reolink locations to see if it resolves the issue. This is the same issue I run into with Reolink: it tries to close an already dead/closed audio child and hangs on the line mentioned in #1254. My fix in #1254 causes go2rtc to crash instead, so that it restarts itself. Hopefully #1431 fixes the issue cleanly.

@TekFan

TekFan commented Nov 22, 2024

Wow. Because the first issue I noticed was the memory leak, I found this thread and didn't bother to look any further.
But #1254 does indeed describe perfectly what happens to me.
I hope #1431 will be merged asap then, at least in the add-on, because this is driving me nuts...
Is there a way to beta test this PR?

@seydx
Contributor

seydx commented Nov 22, 2024

> Wow. Because the first issue I noticed was the memory leak, I found this thread and didn't bother to look any further. But #1254 does indeed describe perfectly what happens to me. I hope #1431 will be merged asap then, at least in the add-on, because this is driving me nuts... Is there a way to beta test this PR?

https://github.com/seydx/go2rtc/actions/runs/11920040811

@TekFan

TekFan commented Dec 2, 2024

@seydx Sorry, I was out for a few days.
What do I do with this file?

@seydx
Contributor

seydx commented Dec 2, 2024

> @seydx Sorry, I was out for a few days. What do I do with this file?

Scroll down, download the right artifact, and replace your go2rtc file with it.

@TekFan

TekFan commented Dec 3, 2024

I had already figured that out, but where is the file located that I have to replace with the one I downloaded?

@maxgiordan

Any news on a fix? I have the same problem. I use HAOS and a Reolink camera with its integration. Core 2024.12.01, Supervisor 2024.11.4, OS 14.0.

The system runs out of memory very quickly, and I had to add an automation that reboots the system when RAM reaches 80% (within a few hours).
It all started with core 2024.11, and it's becoming really frustrating...
Thanks for the work you do. Could someone explain how to go about replacing the go2rtc file?
Screenshot 2024-12-08 114322

@TekFan

TekFan commented Dec 19, 2024

@seydx We are still missing instructions on how to install your fork.
@AlexxIT Do you have an ETA for merging the seydx workaround PR?

@AlexxIT
Owner

AlexxIT commented Dec 19, 2024

No. Significant fixes should be carefully reviewed.
