WAN interface Losing IPv6 connectivity after 30-60 seconds since 25.1.1 upgrade (stateless NDP?) #242
Comments
As discussed, I chased the test patch from a while back to verify this is a kernel issue. It's ee7b012c54ae04 and I just need to build a kernel on top of the stable/25.1 branch to provide a matching test kernel. Not sure if that will happen today, but sharing the plan seems like a good thing to do. :)
Ok here is the test kernel:
(needs a reboot to activate, if you want to go back just do Cheers,
I think you meant

I tested as follows:

1.) Reset

Outcome: IPv6 on a SLAAC WAN interface is behaving as expected: the IPv6 connection is stable after reboot and does not time out after 30-odd seconds (or less). Tested with a simple

It therefore appears that NDP is working as expected to maintain the connection without the manual fixes in place that I mentioned in my report. Clients configured with a ULA, using NAT66 on the firewall to reach the public IPv6 space, can do so without issue; the firewall itself has stable access to the public IPv6 internet via the Zyxel 5G modem. E.g.:

Behavior is as it was in OPNsense 24.7.12_2.
@funtowne long day, sorry... thanks for that. Let me think about how to proceed. The easiest steps are either making the ICMPv6 requirements rules stateless or adding this patch as an adjustable sysctl. The better way forward would be debugging the state tracking, but that will take some time, so an interim solution would be nice. Cheers,
No need to apologize. I wasn’t expecting a patch so fast! It is an upstream change that you’d be fighting potentially indefinitely, no? SLAAC on WAN is probably pretty uncommon, given the intent of IPv6 addressing. Wouldn't some documentation suffice, or a flag in the code for something like: `if WAN IPv6 = SLAAC then workaround()`?
Crap, fat-fingered the GitHub UI on mobile. I didn’t mean to close this!
Just asking because I have been there and done that: Did you use "bridge-mcsnoop 0" on the WAN's bridge interface or did you run "echo -n 0 > /sys/class/net//bridge/multicast_snooping"? I found this out when I thought I had a sure-fire way to make these ND problems reproducible via "ndisc6 -m -n -r 1 fe80::xxxxx eth0" from a Linux client, see: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281395 and https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281397 If you have not set that specific parameter, your findings are probably worthless because of a well-known bug in Linux: https://forum.proxmox.com/threads/ipv6-neighbor-solicitation-not-forwarded-to-vm.96758/ With that bug, neighbor discoveries pass shortly after booting the VM, but later on they get suppressed.
Hi @meyergru -- I went quite deep down that rabbit hole. Good to call it out, though! Here's my /etc/network/interfaces for the relevant bridge that connects to "WAN": iface vmbr1 inet manual In short, only the fix released by Franco --or-- my workaround would work, regardless of the multicast snooping or other similar settings.
Describe the bug
Zyxel 5G modem, IP Passthru mode (for v4); SLAAC for IPv6. Both go to the OPNsense WAN.
On 24.7.x, I was able to have a stable IPv6 connection from the WAN interface of my OPNsense VM, which was pulling the v6 address from the Zyxel 5G modem via SLAAC. I would then NAT this connection via NAT66 to my LAN interfaces, each of which is assigned a static ULA /64 range. I know NAT66 is naughty, but hey, it works.
Since upgrading to 25.1 (and 25.1.1), I have been unable to keep a stable v6 connection on the WAN side for more than a few pings. Running the command `ndp -nc` on the OPNsense VM restores IPv6 via the WAN interface briefly, but v6 pings fail again, usually about 5-8 seconds after running the command.
Tested versions:
24.7.12_2-amd64 -- WAN_SLAAC is only configured with a monitoring IP, no other configuration; IPv6 tunables are defaults. WAN IPv6 works as expected on OPNsense.
25.1 and 25.1.1-amd64 -- the same configuration does not respond to neighbor discoveries on the WAN interface: a first discovery works, but subsequent ones appear to fail, if I am understanding the packet dump correctly. Configuring WAN_SLAAC with the gateway by hand, setting `net.inet6.icmp6.nd6_onlink_ns_rfc4861` to 1, and rebooting fixed IPv6; the connection stays active.
Per a reddit thread with Franco, it looks like there's a need for a "Stateless ICMP ND" patch to prevent pf from interfering with this particular setup. I'm opening this bug report at his request (THANK YOU!)
To Reproduce
Steps to reproduce the behavior:
1.) Configure WAN to SLAAC
2.) Attempt to use IPv6: connectivity to the public internet always times out. The default gateway can still be pinged by hand, however.
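The steps above can be turned into a rough timing check that measures how long IPv6 stays up after a reboot or after `ndp -nc`. This is only a sketch and not part of the original report: the helper names `ping6_once` and `seconds_until_first_loss` are made up for illustration, the target `2001:db8::1` is a documentation placeholder to be replaced with a real public IPv6 host, and it assumes a `ping` binary that accepts `-6` and `-c` (as on FreeBSD 13+ and Linux iputils).

```python
import subprocess
import time
from typing import Callable, Optional


def ping6_once(target: str, timeout_s: float = 2.0) -> bool:
    """Send a single ICMPv6 echo request and report whether it succeeded.

    Assumption: `ping -6 -c 1 <target>` is available on this system.
    The timeout is enforced by subprocess, since ping's own timeout
    flags differ between platforms.
    """
    try:
        result = subprocess.run(
            ["ping", "-6", "-c", "1", target],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def seconds_until_first_loss(
    probe: Callable[[], bool],
    interval_s: float = 1.0,
    max_probes: int = 120,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Call `probe` repeatedly; return the seconds elapsed at the first failure.

    Returns None if all `max_probes` probes succeed.
    """
    start = clock()
    for _ in range(max_probes):
        if not probe():
            return clock() - start
        sleep(interval_s)
    return None


if __name__ == "__main__":
    # Hypothetical target: replace with a reachable public IPv6 address.
    target = "2001:db8::1"
    lost_after = seconds_until_first_loss(lambda: ping6_once(target))
    if lost_after is None:
        print("IPv6 stayed up for the whole test window")
    else:
        print(f"first lost ping after about {lost_after:.0f} s")
```

On an affected 25.1/25.1.1 install, one would expect the first loss within the 30-60 second window from the title; on 24.7.12_2 or with the test kernel, the run should finish without a loss.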
Expected behavior
As with 24.7.12, configuring WAN to SLAAC should keep a stable IPv6 connection without any additional manual intervention like setting a static gateway or other sysctls.
Describe alternatives you considered
N/A, I went a bit bananas getting this far!
Screenshots
Screenshots.zip
Environment
Software version used and hardware type if relevant, e.g.:
OPNsense 24.7.12_2 and 25.1 and 25.1.1
Intel J3455-based Mini PC (Compulab Fitlet 2)
Proxmox hypervisor at latest patchset, OPNsense as a VM
2x VirtIO network interfaces, 4 VLANs LAN-side
Zyxel 5G modem; O2 Germany SIM Card