With 256 BGP sessions between two devices, few are not established when the peer router is rebooted #17655
Comments
@donaldsharp FYI
Here is the show bgp neighbor output in the problem state on both devices for a few neighbors.
Device 1 (the device that is not reloaded and has bgpd at 100%)
Device 2 for the neighbor above
Device 1 second neighbor:
Device 2 2nd neighbor in problem state
Can we get the debug logging?
bgp.summary.t1.log
Could you give some specific time when an actual restart happened? And as I understand this is happening ONLY if |
show bgp summary from t1 device in problem state
show bgp summary from t2 device in problem state
Adding the logs. Both were taken after the reboot command was executed on t1, so the logs start from the moment of the reboot. Adding with debug bgp graceful-restart
Description
When there are 256 BGP sessions between two devices and one of them is rebooted and comes back up, a few sessions are not established.
This issue is seen only with the graceful restart config below. Without it, the issue is not seen.
In addition, the issue results in 100% utilization for bgpd.
Attaching the running config.
show_run.txt
Version
How to reproduce
Create 256 sessions between two devices and reboot one of the devices after the sessions are established.
Expected behavior
All sessions should be established after the peer device reboots.
Actual behavior
A few of the 256 sessions are not established.
Additional context
Donald did some investigation and found the following:
union sockunion *sockunion_getpeername(int fd)
{
    int ret;
    socklen_t len;
    union {
        struct sockaddr sa;
        struct sockaddr_in sin;
        struct sockaddr_in6 sin6;
        char tmp_buffer[128];
    } name;
    union sockunion *su;

    memset(&name, 0, sizeof(name));
    len = sizeof(name);
    ret = getpeername(fd, (struct sockaddr *)&name, &len);
    if (ret < 0) {
        flog_err(EC_LIB_SOCKET, "Can't get remote address and port: %s",
                 safe_strerror(errno));
        return NULL;
    }
So the return value is 0 (no error is reported), yet name is still all zeros:
(gdb) p name
$14 = {sa = {sa_family = 0, sa_data = '\000' <repeats 13 times>},
  sin = {sin_family = 0, sin_port = 0, sin_addr = {s_addr = 0},
    sin_zero = "\000\000\000\000\000\000\000"},
  sin6 = {sin6_family = 0, sin6_port = 0, sin6_flowinfo = 0,
    sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>,
        __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}},
    sin6_scope_id = 0},
  tmp_buffer = '\000' <repeats 127 times>}
(gdb) p len
$15 = 128
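To make the failure mode in the dump concrete: getpeername() reports success, but the address family is 0 (AF_UNSPEC) and len still equals the buffer size, so a caller that checks only `ret < 0` goes on to use an empty address. The sketch below shows one possible defensive check. It is not the FRR fix; `peer_address_is_usable` and `checked_getpeername` are hypothetical names introduced for illustration, and it assumes only IPv4 and IPv6 peers are meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns 1 if the buffer holds a plausible
 * IPv4/IPv6 peer address of sufficient length, 0 otherwise. */
int peer_address_is_usable(const struct sockaddr *sa, socklen_t len)
{
    if (sa == NULL)
        return 0;

    switch (sa->sa_family) {
    case AF_INET:
        return len >= (socklen_t)sizeof(struct sockaddr_in);
    case AF_INET6:
        return len >= (socklen_t)sizeof(struct sockaddr_in6);
    default:
        /* AF_UNSPEC (0) lands here: the all-zero case from the gdb dump. */
        return 0;
    }
}

/* Hypothetical wrapper: treats "success with an empty address" as a
 * failure. Returns 0 on success, -1 otherwise. */
int checked_getpeername(int fd, struct sockaddr_storage *out, socklen_t *len)
{
    assert(out != NULL && len != NULL);

    memset(out, 0, sizeof(*out));
    *len = sizeof(*out);

    if (getpeername(fd, (struct sockaddr *)out, len) < 0)
        return -1; /* errno is set by getpeername() */

    if (!peer_address_is_usable((const struct sockaddr *)out, *len))
        return -1; /* call succeeded but no usable address was filled in */

    return 0;
}
```

With a check like this, the state captured above (sa_family = 0, len = 128) would be rejected the same way as an explicit getpeername() error. Whether rejecting the connection is the right reaction for bgpd is a separate question that this sketch does not answer.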
Checklist