Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release conflicted ip of stateless workload to trigger assigning a new one #3081

Merged

Conversation

Icarus9913
Copy link
Collaborator

@Icarus9913 Icarus9913 commented Jan 12, 2024

Once starting a stateless Pod, the spiderpool will reuse the first allocated IPs from the SpiderEndpoint resource for the Pod if it met some errors. If the coordinator detected the IP conflict, spiderpool will still reuse the conflicted IPs. With this, the Pod would never starts.

So, I added a feature that let coordinator send a signal to spiderpool-agent component to release its whole NICs IPs and let the spiderpool reallocate IPs for the Pod. It might let the Pod starts correctly.

Signed-off-by: Icarus9913 [email protected]

@Icarus9913 Icarus9913 added pr/not-ready not ready for merging release/feature-new release note for new feature labels Jan 12, 2024
Copy link

codecov bot commented Jan 12, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (46b150e) 81.07% compared to head (c7e0ce3) 81.12%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3081      +/-   ##
==========================================
+ Coverage   81.07%   81.12%   +0.05%     
==========================================
  Files          50       50              
  Lines        5358     5373      +15     
==========================================
+ Hits         4344     4359      +15     
  Misses        856      856              
  Partials      158      158              
Flag Coverage Δ
unittests 81.12% <100.00%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...orkloadendpointmanager/workloadendpoint_manager.go 100.00% <100.00%> (ø)

@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch 2 times, most recently from 1f20326 to d07922c Compare January 15, 2024 10:06
@weizhoublue
Copy link
Collaborator

weizhoublue commented Jan 15, 2024

(1) 这应该在 相关 doc 中有说明:对于 无状态 pod,XXXX
(2) 有用例闭环

@weizhoublue weizhoublue changed the title add ip release for ip conflict situation release conflict ip of stateless workload to trigger assigning a new one Jan 15, 2024
@weizhoublue weizhoublue changed the title release conflict ip of stateless workload to trigger assigning a new one release conflicted ip of stateless workload to trigger assigning a new one Jan 15, 2024
@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch 2 times, most recently from 8400439 to 098b161 Compare January 18, 2024 09:09
@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch 4 times, most recently from 37e77f5 to 7efdb15 Compare January 19, 2024 02:57
@Icarus9913 Icarus9913 added pr/ready-review This pull is ready for review and removed pr/not-ready not ready for merging labels Jan 19, 2024
@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch from 7efdb15 to c2d1c72 Compare January 19, 2024 09:38
@@ -57,7 +57,8 @@ For more information about the Underlay Pod not being able to access the Cluster
## Detect Pod IP conflicts(alpha)

IP conflicts are unacceptable for underlay networks, which can cause serious problems. When creating a pod, we can use the `coordinator` to detect whether the IP of the pod conflicts, and support both IPv4 and IPv6 addresses. By sending an ARP or NDP probe message,
If the MAC address of the reply packet is not the pod itself, we consider the IP to be conflicting and reject the creation of the pod with conflicting IP addresses:
If the MAC address of the reply packet is not the pod itself, we consider the IP to be conflicting and reject the creation of the pod with conflicting IP addresses.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个翻译过来的有点 中式
If the MAC address of the reply packet is belonged to the pod itself, the IP address is recognized as conflicted ....
@windsonsea 给一些建议

log := logutils.FromContext(ctx)

var pod *corev1.Pod
// with IP release, we must need Pod UID
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it needs to offer Pod UID when releasing ip

return fmt.Errorf("failed to get the top controller of the Pod %s/%s, error: %v", pod.Namespace, pod.Name, err)
}

// do not release conflict IPs for stable Pod
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not release conflict IPs for stateful Pod

return nil
}
} else {
log.Warn("EnableReleaseConflictIPs is disabled, skip to release IPs")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there IP information in the log ?

}
}

// release SpiderEndpoint IPs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

release IP for stateless endpoint

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

该接口为统一接口,不区分是否为stateless

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I seems to be dedicated for stateless endpoint after line 257 , so I mean that the comment could be the following, and help code reviewer a lot

// release IP for stateless endpoint

@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch from c2d1c72 to 62004c2 Compare January 23, 2024 09:42
output, err := frame.DockerExecCommand(ctx, common.VlanGatewayContainer, commandV6Str)
Expect(err).NotTo(HaveOccurred(), "Failed to exec %s for Node %s, error is: %v, log: %v", commandV6Str, common.VlanGatewayContainer, err, string(output))
}
})
Copy link
Collaborator

@weizhoublue weizhoublue Jan 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a case for stateful pod whose IP should not be released ? add to the test document and it could be implemented next month

@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch from 62004c2 to 4ee16d8 Compare January 29, 2024 06:05
@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch from 4ee16d8 to 14e2658 Compare January 29, 2024 06:59
weizhoublue
weizhoublue previously approved these changes Jan 29, 2024
@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch 3 times, most recently from d5d4dd4 to cb55e3d Compare January 30, 2024 01:44
@Icarus9913 Icarus9913 force-pushed the feat/wk/ip-conflict-release branch from cb55e3d to c7e0ce3 Compare January 30, 2024 05:57
@Icarus9913 Icarus9913 linked an issue Jan 30, 2024 that may be closed by this pull request
@weizhoublue weizhoublue merged commit e03f8ff into spidernet-io:main Jan 30, 2024
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/ready-review This pull is ready for review release/feature-new release note for new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

IP reclaim for various scenarios
3 participants