Scans are slow every 5 minutes, with a single retry #1156
Comments
Yes, that very much looks like a problem with the device itself. The best you can do is probably to capture the scrape with tcpdump/Wireshark and compare send and receive times, to see whether the device is slow to respond.
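As a rough illustration of that comparison, here is a minimal sketch that pairs requests with responses and reports the per-request latency. It assumes the capture has already been reduced to `(timestamp, direction, message_id)` tuples by some other tool; that record format and the function name `response_latencies` are hypothetical, not something tcpdump or the exporter produces directly.

```python
def response_latencies(packets):
    """Pair each request with its first response.

    packets: iterable of (timestamp_seconds, direction, message_id), where
    direction is "out" for a request and "in" for a response.
    (Hypothetical format: extract these fields from the capture yourself.)

    Returns (latencies, unanswered): a dict of message_id -> seconds, and a
    sorted list of message IDs that never received a response.
    """
    first_sent = {}
    latencies = {}
    for ts, direction, msg_id in sorted(packets):
        if direction == "out":
            # Remember only the first time this ID was sent.
            first_sent.setdefault(msg_id, ts)
        elif direction == "in" and msg_id in first_sent and msg_id not in latencies:
            latencies[msg_id] = ts - first_sent[msg_id]
    unanswered = sorted(set(first_sent) - set(latencies))
    return latencies, unanswered
```

A request that shows up in `unanswered`, followed shortly by another request, is what a timeout and retry would be expected to look like in the capture; a large value in `latencies` would instead point to a slow but successful response.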
I agree, but debug info showing which OID was retried would be very useful here.
I'm using a cut-down version of the IF-MIB config; I'm not sure whether that is affecting this somehow:
We would need to have retry callback hooks in the SNMP library in order to get more per-walk debugging. It may be possible; I would have to look at the library code.
If this is possible, it would be great. I'm not sure how else I could easily determine which OID was failing. I may try to decode some of the SNMPv3 packets to see what was requested twice.
Thank you very much. I was able to build that branch with "go build" and can run the output with "./snmp_exporter --snmp.debug-packets --log.level=debug". I've been running this build on my desktop, which reaches the problem device over a VPN, and in the last 10 minutes I haven't recorded a duration greater than 600ms. I have curl running under "watch -n 60", so it does one query a minute. Do you know what I could filter the output by to see only that new debug message? That may make it easier to catch this issue with the improved debugging.
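For anyone who wants to log slow scrapes instead of watching curl output, a minimal sketch of the same once-a-minute check follows. The exporter address `http://localhost:9116`, the target `192.0.2.10`, and the module name `if_mib` are placeholder assumptions, and depending on the exporter version and config extra query parameters (such as an auth name) may be needed; pass those through `extra`.

```python
import time
import urllib.parse
import urllib.request


def scrape_url(exporter, target, module, extra=None):
    """Build the exporter's /snmp URL for one target and module."""
    params = {"target": target, "module": module}
    params.update(extra or {})
    return f"{exporter}/snmp?{urllib.parse.urlencode(params)}"


def timed(fetch):
    """Run fetch() and return (elapsed_seconds, result)."""
    start = time.monotonic()
    result = fetch()
    return time.monotonic() - start, result


def fetch_scrape(url, timeout=30):
    """Fetch one scrape over HTTP and return the body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode()


def watch(url, threshold=0.6, interval=60, rounds=10):
    """Scrape `rounds` times, printing any scrape slower than `threshold` seconds."""
    for _ in range(rounds):
        elapsed, _body = timed(lambda: fetch_scrape(url))
        if elapsed > threshold:
            print(f"{time.strftime('%H:%M:%S')} slow scrape: {elapsed:.3f}s")
        time.sleep(interval)

# Example (placeholder values):
# watch(scrape_url("http://localhost:9116", "192.0.2.10", "if_mib"))
```

The 0.6 second threshold mirrors the 600ms figure above; the timestamps it prints can then be lined up against the exporter's debug log.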
I would look for
For whatever reason I'm not able to reproduce this on my desktop:
Yeah, I typically don't recommend running SNMP over VPN/WAN links. SNMP, being UDP-based, is very sensitive to packet loss and latency. My standard recommendation is to run the snmp_exporter as close to the target devices as possible.
The server with the issue is as close as it can be: it sits behind a router, and that router is attached to this switch. What is strange is that I'm not experiencing the issue over a VPN.
Topology where the problem is observed: Docker container -> core router -> switch (SNMP)
Topology where no issue is observed: my desktop -> VPN -> management router -> switch (SNMP)
A different router is used in each case, but I've checked the core router and it doesn't report any dropped or errored packets.
snmp_exporter version: output of
snmp_exporter -version
snmp_exporter | ts=2024-04-11T15:09:38.160Z caller=main.go:194 level=info msg="Starting snmp_exporter" version="(version=0.25.0, branch=HEAD, revision=9c42d6c874d479314e612bca69558c81f8e26287)" concurrency=1
snmp_exporter | ts=2024-04-11T15:09:38.160Z caller=main.go:195 level=info build_context="(go=go1.21.5, platform=linux/amd64, user=root@880115266f70, date=20231210-10:05:18, tags=netgo)"
What device/snmpwalk OID are you using?
IF-MIB:
oid=1.3.6.1.2.1.2
If this is a new device, please link to the MIB(s).
It's a Netgear GS324TP S350
What did you do that produced an error?
Nothing different than what doesn't produce an error.
What did you expect to see?
What did you see instead?
When I check the response, I can see that it reports a single retried packet, but it's not clear what was retried. The logs don't mention what failed. Similar behavior is being observed with a Lenovo XClarity BMC, also happening every 5 minutes with a single retry.
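To track this over time, the scrape response can be parsed for the exporter's own scrape statistics. The sketch below is a simple line parser for the Prometheus text format; the metric names `snmp_scrape_duration_seconds` and `snmp_scrape_packets_retried` are what I believe the exporter reports, so treat them as assumptions and adjust them to whatever your response actually contains. It also assumes sample lines carry no trailing timestamp.

```python
DEFAULT_NAMES = (
    "snmp_scrape_duration_seconds",  # assumed metric name
    "snmp_scrape_packets_retried",   # assumed metric name
)


def scrape_stats(metrics_text, names=DEFAULT_NAMES):
    """Return {metric_name: value} for the selected metrics in a scrape body."""
    stats = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field on the line.
        name_and_labels, _, value = line.rpartition(" ")
        # Drop any {label="..."} part to get the bare metric name.
        name = name_and_labels.split("{", 1)[0]
        if name in names:
            stats[name] = float(value)
    return stats
```

Recording these two values on every scrape makes it easy to confirm whether the slow scrapes are exactly the ones with a nonzero retry count.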