Added paragraph on network adapters to troubleshooting guide #3439

Merged on Jan 20, 2025 (2 commits)

Ricc68 (Contributor) commented Jan 18, 2025

As per @Nerivec's request in #3360 (comment), I have added some guidance about network adapters to the troubleshooting guide.

Nerivec (Contributor) commented Jan 18, 2025

I wonder whether this paragraph should be moved to https://www.zigbee2mqtt.io/advanced/zigbee/02_improve_network_range_and_stability.html#usb-based-adapter
to be more in line with the rest of the page?

Ricc68 (Contributor Author) commented Jan 18, 2025

Hmmm, that page covers Zigbee adapters, routers, and the Zigbee network, whereas my explanation covers LAN adapters and the LAN.
I think moving it to that page could lead readers to confuse LAN and Zigbee networks, which are different and not interchangeable.

Koenkk (Owner) commented Jan 19, 2025

Not having this correct makes z2m crash, right? So I guess the page is correct. It might be good to describe which errors show up when this is not correct (like the Ember timeout errors).

Ricc68 (Contributor Author) commented Jan 19, 2025

The last commit addresses @Koenkk's observation.

where a timeout occurred on the serial-over-IP protocol, or:

```
[2024-06-24 03:37:24] warning: zh:ember:uart:ash: Frame(s) in progress cancelled in [1ac1020b0a527e]
```

Review comment from a contributor:

It should not produce this one.
But then again, weird network behaviors can... see Koenkk/zigbee2mqtt#23120 (comment) 😨
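
As a side note, the bytes in that log line can be decoded by hand. Below is a minimal Python sketch, assuming the ASH framing described in Silicon Labs' UG101 (cancel byte 0x1A, flag byte 0x7E, a big-endian CRC-16/CCITT with initial value 0xFFFF before the flag, and fixed control bytes 0xC0/0xC1/0xC2). The helper names `frame_type` and `decode_ash_frame` are hypothetical, byte stuffing is ignored, and this is not the parser zigbee-herdsman uses.

```python
import binascii

CANCEL = 0x1A  # cancel byte: discard whatever was received before it
FLAG = 0x7E    # flag byte: marks the end of a frame

# Fixed control bytes (assumption: values taken from UG101)
FIXED_CONTROL = {0xC0: "RST", 0xC1: "RSTACK", 0xC2: "ERROR"}


def frame_type(control: int) -> str:
    """Classify an ASH control byte."""
    if control in FIXED_CONTROL:
        return FIXED_CONTROL[control]
    if (control & 0x80) == 0x00:
        return "DATA"
    if (control & 0xE0) == 0x80:
        return "ACK"
    if (control & 0xE0) == 0xA0:
        return "NAK"
    return "unknown"


def decode_ash_frame(hex_bytes: str) -> dict:
    """Decode one ASH frame given as hex (byte stuffing is not handled)."""
    raw = bytes.fromhex(hex_bytes)
    # Keep only what follows the last cancel byte, if there is one
    if CANCEL in raw:
        raw = raw[raw.rindex(CANCEL) + 1:]
    if len(raw) < 4 or raw[-1] != FLAG:
        raise ValueError("expected control byte, CRC and trailing flag byte")
    body, crc_rx = raw[:-3], int.from_bytes(raw[-3:-1], "big")
    return {
        "type": frame_type(body[0]),
        "payload": body[1:],
        "crc_ok": binascii.crc_hqx(body, 0xFFFF) == crc_rx,
    }


frame = decode_ash_frame("1ac1020b0a527e")
print(frame["type"], frame["payload"].hex(), frame["crc_ok"])
# prints: RSTACK 020b True
```

If that reading is right, the frame in the warning is a reset acknowledgement with a valid checksum, preceded by a cancel byte. In UG101 the two payload bytes of an RSTACK are the ASH version and a reset code.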

Ricc68 (Contributor Author) commented Jan 19, 2025

What I see is an ASH ACK followed by an ASH NAK from 251 (the SLZB) to 112 (Zigbee2MQTT). How come the firmware sends two answers like that, and where are the requests from 112? After the NAK, 112 closes the connection. Two seconds later, 112 reopens the connection and issues an ASH Reset, but 251 again answers with an ASH NAK.
No wonder this communication is garbled, with 251 insisting on replying ASH NAK: the ASH parser in the SLZB firmware probably got out of sync somehow.

This may very well be caused by weird network behaviour, especially if some serial message is corrupted or lost in the stream (packet loss, etc.).

It can have many causes: loops on the network, as mentioned in that comment, but also an unstable connection caused by hardware instabilities or failures like those I described in my PR.
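
To illustrate why a lost chunk of the serial stream garbles the framing, here is a toy Python sketch. It assumes ASH-style frames that end with a CRC-16/CCITT (initial value 0xFFFF) and a 0x7E flag byte. The helpers `make_frame` and `check_frames` are hypothetical, byte stuffing is ignored, and this is neither the SLZB firmware nor zigbee-herdsman code.

```python
import binascii

FLAG = 0x7E  # flag byte marking the end of a frame


def make_frame(payload: bytes) -> bytes:
    """Append a CRC-16/CCITT (init 0xFFFF) and the flag byte to a payload."""
    crc = binascii.crc_hqx(payload, 0xFFFF)
    return payload + crc.to_bytes(2, "big") + bytes([FLAG])


def check_frames(stream: bytes) -> list:
    """Split a stream on flag bytes and report whether each frame's CRC is valid."""
    results = []
    # The piece after the final flag byte is not a complete frame, so skip it
    for chunk in stream.split(bytes([FLAG]))[:-1]:
        payload, crc_rx = chunk[:-2], chunk[-2:]
        crc_ok = len(chunk) >= 3 and (
            binascii.crc_hqx(payload, 0xFFFF).to_bytes(2, "big") == crc_rx
        )
        results.append(crc_ok)
    return results


# Three frames back to back: RST, RSTACK, RST
stream = b"".join(make_frame(bytes.fromhex(p)) for p in ("c0", "c1020b", "c0"))
print(check_frames(stream))   # [True, True, True]

# Lose three bytes that span the boundary between the first two frames
damaged = stream[:3] + stream[6:]
print(check_frames(damaged))  # [False, True]
```

In this toy model the two frames around the gap merge into one that fails its checksum, which is the kind of condition a receiver would answer with a NAK.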

Anyway, hats off to @sorenfalch for the nice analysis, including not only the TCP dump but also the dissector, which helped show just how confused the SLZB firmware was.

Koenkk (Owner) commented Jan 20, 2025

@Nerivec ready for merge?

Ricc68 (Contributor Author) commented Jan 20, 2025

Just to give a little more context: an embedded network stack running on an MCU, as in the SLZB, is inherently limited by the very low resources of the MCU.
Retransmissions are usually a hint that packets are being discarded: see https://docs.espressif.com/projects/esp-faq/en/latest/software-framework/protocols/lwip.html#what-happens-when-esp8266-receives-a-tcp-out-of-order-message
If the network stack is discarding packets, the serial stream breaks and the serial protocol parser in the MCU starts to go wrong.

There is not much that can be done on the MCU side to fix this (actually nothing), because so few network buffers are available. In most cases there is just one network buffer, to save RAM, so a retransmission means that packets are surely being discarded.
That's why it is better to have a USB dongle rather than a networked adapter like the SLZB.
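
To make the link between discarded packets and retransmissions concrete, here is a toy Python model. It assumes a receiver with no reassembly queue that only accepts the next expected segment and drops anything else. The function name `deliver` is hypothetical, and this is a deliberately simplified sketch, not how lwIP is implemented.

```python
from collections import deque


def deliver(arrival_order):
    """Simulate a receiver that only accepts the next expected segment.

    arrival_order lists segment numbers in the order they first arrive.
    A segment that is not the next expected one is discarded (there is no
    spare buffer to hold it) and has to be sent again later.
    Returns (delivered_order, retransmissions).
    """
    if sorted(arrival_order) != list(range(len(arrival_order))):
        raise ValueError("arrival_order must be a permutation of 0..n-1")
    pending = deque(arrival_order)
    expected = 0
    delivered = []
    retransmissions = 0
    while pending:
        seg = pending.popleft()
        if seg == expected:
            delivered.append(seg)
            expected += 1
        else:
            # Out of order: dropped now, retransmitted by the sender later
            pending.append(seg)
            retransmissions += 1
    return delivered, retransmissions


print(deliver([0, 1, 2, 3]))  # ([0, 1, 2, 3], 0)
print(deliver([0, 2, 1, 3]))  # ([0, 1, 2, 3], 2)
```

In this model every retransmission corresponds to a segment the receiver threw away, which is why retransmissions in a capture are a useful warning sign.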

Koenkk merged commit 3033f2b into Koenkk:dev on Jan 20, 2025
1 check passed