custom: add support for custom container #84

Merged
merged 3 commits into from
Sep 24, 2024
update
Signed-off-by: vsoch <vsoch@users.noreply.github.com>
vsoch committed Sep 24, 2024
commit 494517737acf3a0f743a0fa4f5b47703ccd0bf85
13 changes: 8 additions & 5 deletions api/v1alpha2/zz_generated.deepcopy.go

2 changes: 1 addition & 1 deletion config/manager/kustomization.yaml
@@ -5,4 +5,4 @@ kind: Kustomization
images:
- name: controller
newName: ghcr.io/converged-computing/metrics-operator
newTag: test
newTag: latest
7 changes: 7 additions & 0 deletions docs/_static/data/metrics.json
@@ -20,6 +20,13 @@
"image": "ghcr.io/converged-computing/metric-cabanapic:latest",
"url": "https://github.com/ECP-copa/CabanaPIC"
},
{
"name": "app-custom",
"description": "Provide a custom application for MPI trace",
"family": "proxyapp",
"image": "",
"url": "https://converged-computing.github.io/metrics-operator"
},
{
"name": "app-hpl",
"description": "High-Performance Linpack (HPL)",
3 changes: 3 additions & 0 deletions docs/getting_started/metrics.md
@@ -288,6 +288,9 @@ spec:
mount: /opt/mnt
image: ghcr.io/converged-computing/metric-mpitrace:ubuntu-jammy
workdir: <workdir>
# This is the target replicated job; "l" means launcher
target: l
# This is the target container, with full name "launcher"
containerTarget: launcher
```
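
The `target` and `containerTarget` fields documented above can be read as two filters: one on the replicated job name and one on the container name. Below is a minimal Go sketch of that idea, not the operator's actual implementation. The `replicatedJob` type, the `matches` and `selectContainers` helpers, and the `w`, `workers`, and `sidecar` names are hypothetical. Treating an empty `containerTarget` as "match every container" is also an assumption, modeled on the `a.target != "" && a.target != rj.Name` check in `pkg/addons/mpitrace.go`.

```go
package main

import "fmt"

// replicatedJob is a hypothetical stand-in for a JobSet replicated job;
// it only carries the fields needed to illustrate target matching.
type replicatedJob struct {
	Name       string
	Containers []string
}

// matches reports whether a selector applies to a name.
// An empty selector matches everything (assumed for containerTarget,
// and consistent with the job-name skip condition shown in the diff).
func matches(selector, name string) bool {
	return selector == "" || selector == name
}

// selectContainers returns the "job/container" pairs that an addon
// with the given target and containerTarget would customize.
func selectContainers(target, containerTarget string, rjs []replicatedJob) []string {
	var selected []string
	for _, rj := range rjs {
		// Skip replicated jobs whose name does not match the target.
		if !matches(target, rj.Name) {
			continue
		}
		for _, c := range rj.Containers {
			if matches(containerTarget, c) {
				selected = append(selected, rj.Name+"/"+c)
			}
		}
	}
	return selected
}

func main() {
	rjs := []replicatedJob{
		{Name: "l", Containers: []string{"launcher", "sidecar"}},
		{Name: "w", Containers: []string{"workers"}},
	}
	// Mirrors the documented example: target "l", containerTarget "launcher".
	fmt.Println(selectContainers("l", "launcher", rjs))
}
```

With `target: l` and `containerTarget: launcher`, only the `launcher` container of the `l` job is selected in this sketch; leaving both fields empty would select every container.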

47 changes: 35 additions & 12 deletions examples/dist/metrics-operator-arm.yaml
@@ -15,8 +15,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.11.1
creationTimestamp: null
controller-gen.kubebuilder.io/version: v0.14.0
name: metricsets.flux-framework.org
spec:
group: flux-framework.org
@@ -33,10 +32,19 @@ spec:
description: MetricSet is the Schema for the metrics API
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
@@ -45,27 +53,39 @@ spec:
properties:
deadlineSeconds:
default: 31500000
description: Should the job be limited to a particular number of seconds? Approximately one year. This cannot be zero or job won't start
description: |-
Should the job be limited to a particular number of seconds?
Approximately one year. This cannot be zero or job won't start
format: int64
type: integer
dontSetFQDN:
description: Don't set JobSet FQDN
type: boolean
logging:
description: Logging spec, preparing for other kinds of logging Right now we just include an interactive option
description: |-
Logging spec, preparing for other kinds of logging
Right now we just include an interactive option
properties:
interactive:
description: Don't allow the application, metric, or storage test to finish This adds sleep infinity at the end to allow for interactive mode.
description: |-
Don't allow the application, metric, or storage test to finish
This adds sleep infinity at the end to allow for interactive mode.
type: boolean
type: object
metrics:
description: The name of the metric (that will be associated with a flavor like storage)
items:
properties:
addons:
description: A Metric addon can be storage (volume) or an application, It's an additional entity that can customize a replicated job, either adding assets / features or entire containers to the pod
description: |-
A Metric addon can be storage (volume) or an application,
It's an additional entity that can customize a replicated job,
either adding assets / features or entire containers to the pod
items:
description: 'A Metric addon is an interface that exposes extra volumes for a metric. Examples include: A storage volume to be mounted on one or more of the replicated jobs A single application container.'
description: |-
A Metric addon is an interface that exposes extra volumes for a metric. Examples include:
A storage volume to be mounted on one or more of the replicated jobs
A single application container.
properties:
listOptions:
additionalProperties:
@@ -126,7 +146,9 @@ spec:
- type: string
x-kubernetes-int-or-string: true
type: array
description: Metric List Options Metric specific options
description: |-
Metric List Options
Metric specific options
type: object
mapOptions:
additionalProperties:
@@ -146,7 +168,9 @@ spec:
- type: integer
- type: string
x-kubernetes-int-or-string: true
description: Metric Options Metric specific options
description: |-
Metric Options
Metric specific options
type: object
resources:
description: Resources include limits and requests for the metric container
@@ -280,7 +304,6 @@ rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: null
name: metrics-manager-role
rules:
- apiGroups:
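
The CRD above describes the `deadlineSeconds` default of 31500000 as "approximately one year". As a quick check of that arithmetic, here is a small Go sketch; the constant name `defaultDeadlineSeconds` and the helper `deadlineDays` are illustrative and do not come from the operator's code.

```go
package main

import (
	"fmt"
	"time"
)

// defaultDeadlineSeconds copies the default value shown in the CRD;
// the constant name itself is hypothetical.
const defaultDeadlineSeconds int64 = 31500000

// deadlineDays converts a deadline given in seconds to (fractional) days.
func deadlineDays(seconds int64) float64 {
	d := time.Duration(seconds) * time.Second
	return d.Hours() / 24
}

func main() {
	// 31500000 s is exactly 8750 hours, a little under 365 days.
	fmt.Printf("%.2f days\n", deadlineDays(defaultDeadlineSeconds))
}
```

The default works out to about 364.58 days, so "approximately one year" holds.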
47 changes: 35 additions & 12 deletions examples/dist/metrics-operator.yaml
@@ -15,8 +15,7 @@ apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.11.1
creationTimestamp: null
controller-gen.kubebuilder.io/version: v0.14.0
name: metricsets.flux-framework.org
spec:
group: flux-framework.org
@@ -33,10 +32,19 @@ spec:
description: MetricSet is the Schema for the metrics API
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
@@ -45,27 +53,39 @@ spec:
properties:
deadlineSeconds:
default: 31500000
description: Should the job be limited to a particular number of seconds? Approximately one year. This cannot be zero or job won't start
description: |-
Should the job be limited to a particular number of seconds?
Approximately one year. This cannot be zero or job won't start
format: int64
type: integer
dontSetFQDN:
description: Don't set JobSet FQDN
type: boolean
logging:
description: Logging spec, preparing for other kinds of logging Right now we just include an interactive option
description: |-
Logging spec, preparing for other kinds of logging
Right now we just include an interactive option
properties:
interactive:
description: Don't allow the application, metric, or storage test to finish This adds sleep infinity at the end to allow for interactive mode.
description: |-
Don't allow the application, metric, or storage test to finish
This adds sleep infinity at the end to allow for interactive mode.
type: boolean
type: object
metrics:
description: The name of the metric (that will be associated with a flavor like storage)
items:
properties:
addons:
description: A Metric addon can be storage (volume) or an application, It's an additional entity that can customize a replicated job, either adding assets / features or entire containers to the pod
description: |-
A Metric addon can be storage (volume) or an application,
It's an additional entity that can customize a replicated job,
either adding assets / features or entire containers to the pod
items:
description: 'A Metric addon is an interface that exposes extra volumes for a metric. Examples include: A storage volume to be mounted on one or more of the replicated jobs A single application container.'
description: |-
A Metric addon is an interface that exposes extra volumes for a metric. Examples include:
A storage volume to be mounted on one or more of the replicated jobs
A single application container.
properties:
listOptions:
additionalProperties:
@@ -126,7 +146,9 @@ spec:
- type: string
x-kubernetes-int-or-string: true
type: array
description: Metric List Options Metric specific options
description: |-
Metric List Options
Metric specific options
type: object
mapOptions:
additionalProperties:
@@ -146,7 +168,9 @@ spec:
- type: integer
- type: string
x-kubernetes-int-or-string: true
description: Metric Options Metric specific options
description: |-
Metric Options
Metric specific options
type: object
resources:
description: Resources include limits and requests for the metric container
@@ -280,7 +304,6 @@ rules:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
creationTimestamp: null
name: metrics-manager-role
rules:
- apiGroups:
2 changes: 1 addition & 1 deletion pkg/addons/mpitrace.go
@@ -96,7 +96,7 @@ func (a *MPITrace) CustomizeEntrypoints(
logger.Infof("🟧️ Customizing entrypoints for %s\n", rjs)

for _, rj := range rjs {
logger.Infof("🟧️ Comparing %s vs %s\n", a.target, rj.Name)
logger.Infof("🟧️ Comparing job target %s vs job name %s\n", a.target, rj.Name)

// Only customize if the replicated job name matches the target
if a.target != "" && a.target != rj.Name {
3 changes: 0 additions & 3 deletions pkg/metrics/base.go
@@ -8,8 +8,6 @@ SPDX-License-Identifier: MIT
package metrics

import (
"fmt"

api "github.com/converged-computing/metrics-operator/api/v1alpha2"
"github.com/converged-computing/metrics-operator/pkg/addons"
"github.com/converged-computing/metrics-operator/pkg/specs"
@@ -174,7 +172,6 @@ func (m BaseMetric) AddAddons(
addonContainers = append(addonContainers, assembleContainer)
}

fmt.Println(addonContainers)
// Allow the addons to customize the container entrypoints, specific to the job name
// It's important that this set does not include other addon container specs
a.CustomizeEntrypoints(containerSpecs, rjs)