Heartbeat Not Functioning Correctly for Large Node Sets in OPC Publisher 2.9.11 #2357

Open
aurovico opened this issue Oct 11, 2024 · 0 comments
Labels
bug Something isn't working
aurovico commented Oct 11, 2024

I'm currently using OPC Publisher version 2.9.11 with around 75,000 nodes configured in the publishednodes.json file. I'm encountering an issue where some nodes do not report their values for hours or even days, despite configuring the HeartbeatInterval both globally in the module's command-line arguments and for each individual node in the publishednodes.json.

Specifically, nodes that do not change their value frequently fail to report via the heartbeat mechanism, as if the heartbeat isn't functioning. However, nodes that change frequently are being reported correctly. What's unusual is that when I create a smaller publishednodes.json file containing only a few problematic nodes (e.g., 3 or 4 nodes), the heartbeat works perfectly, and the values are reported as expected, even for those infrequently changing nodes.

Environment Details:

  • OPC Publisher Version: 2.9.11
  • OPC Server: KepServer (latest version)
  • Node Configuration Example:
    {
      "Id": "ns=2;s=P_Test_Fabric1_Planta13_Prensa.PLC.Vacio.Parametros.Retardo",
      "DisplayName": "Test_Fabric1_Planta13_Prensa_Vacio_Parametros_Retardo",
      "OpcSamplingInterval": 5000,
      "OpcPublishingInterval": 5000,
      "HeartbeatInterval": 3499,
      "SkipFirst": false
    }
  • Command Line Arguments:
    { "image": "mcr.microsoft.com/iotedge/opc-publisher:2.9.11", "createOptions": "{\"Hostname\":\"OPCPublisherSilestone2\",\"Cmd\":[\"--ll=Information\",\"--pf=/appdata/publishednodes-silestone2.json\",\"--aa\",\"--ap=/appdata/pki/own\",\"--tp=/appdata/pki/trusted\",\"--rp=/appdata/pki/rejected\",\"--ip=/appdata/pki/issuer\",\"--si=500\",\"--ms=131072\",\"--hbb=WatchdogLKVWithUpdatedTimestamps\",\"--hb=1800\",\"--mm=FullSamples\",\"--npd=1000\",\"--eip=false\"],\"ExposedPorts\":{\"80/tcp\":{}},\"HostConfig\":{\"Binds\":[\"/home/aurovico/appdata:/appdata\"],\"PortBindings\":{\"80/tcp\":[{\"HostPort\":8084}]}}}" }
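To narrow this down, it may help to confirm that every node in the large file really carries a HeartbeatInterval and to generate the small test file automatically. The following is a minimal sketch, assuming the usual publishednodes.json layout of a list of endpoint entries that each hold an "OpcNodes" array. The function names `heartbeat_summary` and `subset` are hypothetical helpers, not part of OPC Publisher.

```python
import json


def heartbeat_summary(entries):
    """Count nodes with and without an explicit HeartbeatInterval."""
    total = 0
    with_heartbeat = 0
    for entry in entries:
        for node in entry.get("OpcNodes", []):
            total += 1
            if "HeartbeatInterval" in node:
                with_heartbeat += 1
    return {
        "total": total,
        "with_heartbeat": with_heartbeat,
        "without_heartbeat": total - with_heartbeat,
    }


def subset(entries, wanted_ids):
    """Return a reduced configuration that keeps only the given node ids."""
    wanted = set(wanted_ids)
    reduced = []
    for entry in entries:
        nodes = [n for n in entry.get("OpcNodes", []) if n.get("Id") in wanted]
        if nodes:
            # Keep the endpoint settings, replace only the node list.
            reduced.append({**entry, "OpcNodes": nodes})
    return reduced


def load_entries(path):
    """Load a publishednodes.json file from disk (path is supplied by the caller)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Running `heartbeat_summary(load_entries(...))` on the full file and `subset(...)` with the ids of a few silent nodes would reproduce the large and small configurations compared in this report.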

The heartbeat is expected to send values periodically for nodes that are not changing frequently, but this doesn't seem to work when a large number of nodes are being monitored. After a module restart, all values are reported correctly, but the problem recurs over time. The OPC server (KepServer) is up-to-date and stable, and there are no indications of issues on the server side.

Could this be related to the large number of nodes being monitored by OPC Publisher? Is there a limit or internal queue that could be causing heartbeats not to trigger for infrequently changing nodes? Any insights or recommendations to troubleshoot or resolve this issue would be greatly appreciated.

Diagnostic Information:
DIAGNOSTICS INFORMATION for : <> (da39a3ee5e6b4b0d3255bfef95601890afd80709)

# OPC Publisher Version (Runtime) : 2.9.11.2+63bedd6b75 (.NET 8.0.8/linux-x64/OPC Stack 1.5.374.78)

# Cpu/Memory max : 6.00 | 6,044,955 KB

# Cpu/Memory available : 1.00 | 6,044,955 KB

# Cpu/Memory % used (window/total) : 708.79 % | 2,149.19 % (1,299,173 kb)

# Ingest duration (dd:hh:mm:ss)/Time : 00:00:09:50 | 2024-10-11T09:49:56.4119335+00:00

# Endpoints connected/disconnected : 3 | 0 (Connected)

# Connections created/retries : 3 | 0

# Subscriptions count : 89

# Good/Bad Monitored Items (Late) : 75,081 | 1,791 (0)

# Queued/Minimum request count : 89 | 6

# Good/Bad Publish request count : 89 | 0

# Heartbeats/Condition items active : 76,872 | 0

# Ingress value changes : 394,040 (All time ~667.70/s; 33,971 in last 60s ~566.18/s)

# Ingress sampled values : 0

# Ingress events : 0

# Ingress values/events unassignable : 0

# Server queue overflows : 0

# Received Data Change Notifications : 9,977 (All time ~16.91/s; 1,075 in last 60s ~17.92/s)

# Received Event Notifications : 0

# Received Keep Alive Notifications : 411 (All time ~41.79/min)

# Received Cyclic read Notifications : 0

# Generated Heartbeat Notifications : 0

# Generated Model Changes : 0

# Publish queue partitions/active : 1 | 1

# Notifications buffered/dropped : 22 | 0

# Encoder input buffer size : 0

# Encoder Notif. processed/dropped : 9,956 | 0

# Encoder Network Messages produced : 2,285

# Encoder avg Notifications/Message : 4

# Encoder worst Message split ratio : 2

# Encoder avg Message body size : 90936.14 (Avg Chunk (4 KB) usage 22.20; 0.0/day estimated)

# Encoder output buffer size : 0

# Egress Messages queued/dropped : 0 | 0

# Egress Message send failures : 0

# Egress Messages successfully sent : 200 (0.34/s)
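The dump above shows 76,872 active heartbeat items but 0 generated heartbeat notifications. A small script could watch for that combination across successive dumps. This is only a sketch: it assumes each diagnostics line has the `# Name : value` shape seen above, and `parse_diagnostics`, `first_number` and `heartbeats_silent` are hypothetical helper names. Note that this snapshot covers an ingest duration of under ten minutes, so if the configured interval has not yet elapsed a zero count may be expected. The check is more telling on a dump taken after the problem has reappeared.

```python
import re


def parse_diagnostics(text):
    """Turn '# Name : value' lines into a dict of name -> raw value string."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("#"):
            continue
        name, separator, value = line[1:].partition(" : ")
        if separator:
            values[name.strip()] = value.strip()
    return values


def first_number(value):
    """Return the first number in a value such as '76,872 | 0' as a float."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", value)
    if match is None:
        raise ValueError(f"no number in {value!r}")
    return float(match.group().replace(",", ""))


def heartbeats_silent(diagnostics):
    """True when heartbeat items are active but none have been generated."""
    active = first_number(diagnostics["Heartbeats/Condition items active"])
    generated = first_number(diagnostics["Generated Heartbeat Notifications"])
    return active > 0 and generated == 0
```

Feeding the text of each periodic dump through `parse_diagnostics` and logging the result of `heartbeats_silent` alongside the ingest duration would show whether the generated count ever moves off zero with the full node set.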


@marcschier marcschier added the bug Something isn't working label Oct 12, 2024
@marcschier marcschier modified the milestones: 2.9.12, 2.9.13 Oct 12, 2024