This repository has been archived by the owner on Oct 26, 2022. It is now read-only.

Suspected memory leak in the controller #216

Open
ibaldin opened this issue Jul 23, 2018 · 4 comments

Comments


ibaldin commented Jul 23, 2018

Details are recorded in RENCI-NRIG/exogeni#224; we need to investigate further. Mert reports that memory consumption on the controller slowly increases over time.

Possibly related to TicketReviewPolicy.
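
One way to confirm the slow growth before digging into any particular component is to sample heap usage at fixed intervals and look at the trend. The following is a minimal sketch, not the controller's own tooling: the class name `HeapTrend` and its methods are hypothetical, and it reads the heap of the JVM it runs in, so it would have to run inside the controller process (or be adapted to a remote JMX connection) to say anything about the controller.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper: samples used heap and estimates a growth trend.
public class HeapTrend {

    /** Least-squares slope of the samples against their index (bytes per sample). */
    static double slope(long[] samples) {
        int n = samples.length;
        if (n < 2) {
            return 0.0; // not enough points to define a trend
        }
        double xMean = (n - 1) / 2.0;
        double yMean = 0.0;
        for (long s : samples) {
            yMean += s;
        }
        yMean /= n;
        double num = 0.0;
        double den = 0.0;
        for (int i = 0; i < n; i++) {
            num += (i - xMean) * (samples[i] - yMean);
            den += (i - xMean) * (i - xMean);
        }
        return num / den;
    }

    /** Reads used heap of the current JVM `count` times, `intervalMillis` apart. */
    static long[] sampleHeap(int count, long intervalMillis) throws InterruptedException {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        long[] used = new long[count];
        for (int i = 0; i < count; i++) {
            used[i] = mem.getHeapMemoryUsage().getUsed();
            if (i < count - 1) {
                Thread.sleep(intervalMillis);
            }
        }
        return used;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] used = sampleHeap(5, 1000);
        System.out.printf("heap trend: %.0f bytes per sample%n", slope(used));
    }
}
```

Used heap rises and falls with garbage collection, so a trend is only meaningful over many samples taken across a long period; a handful of samples over a few seconds mostly shows allocation noise.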


kthare10 commented May 6, 2019

According to Eclipse Memory Analyzer, 1.8 GB of memory is being consumed by the PubSub thread. A first review of the PubSub controller code does not reveal any issues. Proposal: disable PubSub on ExoSM.
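
Eclipse Memory Analyzer works from a heap dump (an `.hprof` file). For reference, here is a minimal sketch of producing one programmatically on a HotSpot JVM through the `HotSpotDiagnosticMXBean`. The class name `HeapDumper` and the output path are hypothetical, and this is not necessarily how the dump analyzed above was taken. It dumps the JVM it runs in, so for the controller it would need to run inside that process.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes a heap dump of the current (HotSpot) JVM.
public class HeapDumper {

    /**
     * Writes a heap dump to `target` and returns it.
     * With liveOnly = true, only reachable objects are included.
     */
    static Path dump(Path target, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file, so clear it first.
        Files.deleteIfExists(target);
        bean.dumpHeap(target.toAbsolutePath().toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Example output location (an assumption); the name must end in .hprof.
        Path out = Paths.get(System.getProperty("java.io.tmpdir"), "controller-heap.hprof");
        System.out.println("heap dump written to " + dump(out, true));
    }
}
```

Dumping only live objects keeps the file smaller and focuses the analysis on what is actually being retained, which is what matters when looking for a leak.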


kthare10 commented May 6, 2019

Inputs from Mert:

Mert Cevik [2:03 PM]
Actually, the XMPP daemon, the database, and all the related components have been there for a long time. I know that OpenFire on control.exogeni.net has some issues as well. I see exceptions occasionally, and the service crashes too. That's one of the reasons I wanted to migrate some services off control.exogeni.net and update them.
I need to recall the chain for this pubsub mechanism; I will do that very soon.
Also, one observation: there was a fairly recent fix, around last year, for some manifests not being published (I will find the relevant issue). Things got better after that, but I was still noticing some missing publish events.
Setting up a new database and a new OpenFire sounds more appealing to me.

But first, as you mentioned, disabling pubsub may be a better diagnostic step. Let me see about that.


mcevik0 commented May 6, 2019

The "ORCA pubsub properties" block in geni.renci.org:/etc/orca/controller-11080/config/controller.properties (lines 85-90) has been commented out and the controller JVM restarted.


mcevik0 commented Nov 25, 2019

A new OpenFire server installed with the latest version showed exceptions similar to those on the current instance. Careful investigation is needed to determine whether the culprit is the OpenFire server process itself. Instead of switching the production XMPP OpenFire server to a new server, we may consider setting up a test server and pointing a pubsub instance at it.
