Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RayCluster][Feature] setup GCS FT annotations and the RAY_REDIS_ADDRESS env by the GcsFaultToleranceOptions #2721

Merged
merged 1 commit into from
Jan 9, 2025

Conversation

rueian
Copy link
Contributor

@rueian rueian commented Jan 9, 2025

Why are these changes needed?

This PR addresses the following selected part:
image

Related issue number

Resolves #2720

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@@ -96,6 +96,9 @@ func initTemplateAnnotations(instance rayv1.RayCluster, podTemplate *corev1.PodT
podTemplate.Annotations[utils.RayExternalStorageNSAnnotationKey] = v
}
}
if options := instance.Spec.GcsFaultToleranceOptions; options != nil && options.ExternalStorageNamespace != "" {
podTemplate.Annotations[utils.RayExternalStorageNSAnnotationKey] = options.ExternalStorageNamespace
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Override the ray.io/external-storage-namespace annotation if the new ExternalStorageNamespace is specified. We rely on this annotation to set the RAY_external_storage_namespace env in the later setContainerEnvVars function.

@@ -322,7 +325,7 @@ func initLivenessAndReadinessProbe(rayContainer *corev1.Container, rayNodeType r
}

// BuildPod a pod config
func BuildPod(ctx context.Context, podTemplateSpec corev1.PodTemplateSpec, rayNodeType rayv1.RayNodeType, rayStartParams map[string]string, headPort string, enableRayAutoscaler *bool, creatorCRDType utils.CRDType, fqdnRayIP string) (aPod corev1.Pod) {
func BuildPod(ctx context.Context, podTemplateSpec corev1.PodTemplateSpec, rayNodeType rayv1.RayNodeType, gcsOptions *rayv1.GcsFaultToleranceOptions, rayStartParams map[string]string, headPort string, enableRayAutoscaler *bool, creatorCRDType utils.CRDType, fqdnRayIP string) (aPod corev1.Pod) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just add a new gcsOptions parameter and pass it to the later setContainerEnvVars function.

Comment on lines +639 to +646
if pod.Annotations[utils.RayFTEnabledAnnotationKey] != "true" {
t.Fatalf("Ray pod has unexpected %s annotation: %v", utils.RayFTEnabledAnnotationKey, pod.Annotations[utils.RayFTEnabledAnnotationKey])
}
if pod.Annotations[utils.RayExternalStorageNSAnnotationKey] != string(cluster.UID) {
t.Fatalf("Ray pod has unexpected %s annotation: %v", utils.RayExternalStorageNSAnnotationKey, pod.Annotations[utils.RayExternalStorageNSAnnotationKey])
}
checkContainerEnv(t, rayContainer, utils.RAY_REDIS_ADDRESS, "redis://127.0.0.1:6379")
checkContainerEnv(t, rayContainer, utils.RAY_EXTERNAL_STORAGE_NS, string(cluster.UID))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure both annotations and envs are set.

@kevin85421
Copy link
Member

@MortalHappiness would you mind also reviewing this PR?

…ESS env by the GcsFaultToleranceOptions

Signed-off-by: Rueian <[email protected]>
Copy link
Member

@MortalHappiness MortalHappiness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@kevin85421 kevin85421 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@kevin85421 kevin85421 merged commit 82e2554 into ray-project:master Jan 9, 2025
24 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RayCluster][Feature] add GcsFaultToleranceOptions to the RayCluster CRD [2/N]
3 participants