devops-stack-module-longhorn

A DevOps Stack module to deploy and configure Longhorn.

The Longhorn chart used by this module is shipped in this repository as well, in order to avoid any unwanted behaviors caused by unsupported versions.

Current Chart Version | Original Repository | Default Values
1.7.0 | Chart | values.yaml

Important
For the moment, this module only supports the deployment of Longhorn in SKS clusters.

Usage

A simple declaration of the module would look like this:

module "longhorn" {
  source = "git::https://github.com/camptocamp/devops-stack-module-longhorn.git?ref=<RELEASE>"

  cluster_name     = module.sks.cluster_name
  base_domain      = module.sks.base_domain
  cluster_issuer   = local.cluster_issuer
  argocd_namespace = module.argocd_bootstrap.argocd_namespace

  dependency_ids = {
    argocd = module.argocd_bootstrap.id
  }
}
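
The declaration above reads `local.cluster_issuer`, which is expected to be defined in your root module. A minimal sketch, assuming that a ClusterIssuer with the chosen name (here one of the names mentioned in the technical reference) already exists in your cluster:

```hcl
locals {
  # Assumption: a ClusterIssuer with this name is deployed in the cluster.
  # "letsencrypt-prod" is the other value mentioned in the technical reference.
  cluster_issuer = "letsencrypt-staging"
}
```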

You can enable the ingress to the Longhorn Dashboard. In that case, you will need to enable the respective flag and pass along the required OIDC configuration:

module "longhorn" {
  source = "git::https://github.com/camptocamp/devops-stack-module-longhorn.git?ref=<RELEASE>"

  cluster_name     = module.sks.cluster_name
  base_domain      = module.sks.base_domain
  cluster_issuer   = local.cluster_issuer
  argocd_namespace = module.argocd_bootstrap.argocd_namespace

  enable_dashboard_ingress = true
  oidc                     = module.oidc.oidc

  dependency_ids = {
    argocd       = module.argocd_bootstrap.id
    traefik      = module.traefik.id
    cert-manager = module.cert-manager.id
    keycloak     = module.keycloak.id
    oidc         = module.oidc.id
  }
}

Note
The previous example uses Keycloak as an OIDC provider, but you can use any other you want.

If you want to back up the content of the persistent volumes, you can enable the backup feature. In that case, you will need to enable the respective flag and pass along the required S3 configuration:

module "longhorn" {
  source = "git::https://github.com/camptocamp/devops-stack-module-longhorn.git?ref=<RELEASE>"

  cluster_name     = module.sks.cluster_name
  base_domain      = module.sks.base_domain
  cluster_issuer   = local.cluster_issuer
  argocd_namespace = module.argocd_bootstrap.argocd_namespace

  enable_dashboard_ingress = true
  oidc                     = module.oidc.oidc

  enable_pv_backups = true
  backup_storage = {
    bucket_name = resource.aws_s3_bucket.this["longhorn"].id
    region      = resource.aws_s3_bucket.this["longhorn"].region
    endpoint    = "sos-${resource.aws_s3_bucket.this["longhorn"].region}.exo.io"
    access_key  = resource.exoscale_iam_access_key.s3_iam_key["longhorn"].key
    secret_key  = resource.exoscale_iam_access_key.s3_iam_key["longhorn"].secret
  }

  dependency_ids = {
    argocd       = module.argocd_bootstrap.id
    traefik      = module.traefik.id
    cert-manager = module.cert-manager.id
    keycloak     = module.keycloak.id
    oidc         = module.oidc.id
  }
}

Important
You are in charge of creating the S3 bucket to store the PV backups. We’ve decided to keep the creation of this bucket outside of this module, mainly because the persistence of the data should not be related to the instantiation of the module itself.
Tip
Check the SKS deployment example to see how to create the S3 bucket and to better understand the values passed on the example above.
Tip
On the technical reference below you will find further customization options, such as the backup/snapshot schedule.
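
As a rough illustration of the bucket referenced as `resource.aws_s3_bucket.this["longhorn"]` in the example above, the following sketch declares it with `for_each`. It assumes an `aws` provider that is already configured for your S3-compatible endpoint, and the bucket name is hypothetical. The SKS deployment example remains the reference for the full setup, including the access key.

```hcl
resource "aws_s3_bucket" "this" {
  # One bucket per key, so that the Longhorn bucket can be addressed as
  # aws_s3_bucket.this["longhorn"] in the module declaration.
  for_each = toset(["longhorn"])

  # Hypothetical naming scheme; adapt it to your own conventions.
  bucket = "my-cluster-${each.key}"
}
```

Because the bucket lives outside the module, destroying or recreating the module does not touch the stored backups.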

If you need to configure something beyond the common settings that we have provided, you can customize the chart’s values.yaml by adding a Helm configuration as an HCL structure:

module "longhorn" {
  source = "git::https://github.com/camptocamp/devops-stack-module-longhorn.git?ref=<RELEASE>"

  cluster_name     = module.sks.cluster_name
  base_domain      = module.sks.base_domain
  cluster_issuer   = local.cluster_issuer
  argocd_namespace = module.argocd_bootstrap.argocd_namespace

  enable_dashboard_ingress = true
  oidc                     = module.oidc.oidc

  enable_pv_backups = true
  backup_storage = {
    bucket_name = resource.aws_s3_bucket.this["longhorn"].id
    region      = resource.aws_s3_bucket.this["longhorn"].region
    endpoint    = "sos-${resource.aws_s3_bucket.this["longhorn"].region}.exo.io"
    access_key  = resource.exoscale_iam_access_key.s3_iam_key["longhorn"].key
    secret_key  = resource.exoscale_iam_access_key.s3_iam_key["longhorn"].secret
  }

  helm_values = [{ # Note the curly brackets here
    longhorn = {
      map = {
        string = "string"
        bool   = true
      }
      sequence = [
        {
          key1 = "value1"
          key2 = "value2"
        },
        {
          key1 = "value1"
          key2 = "value2"
        },
      ]
      sequence2 = [
        "string1",
        "string2"
      ]
    }
  }]

  dependency_ids = {
    argocd       = module.argocd_bootstrap.id
    traefik      = module.traefik.id
    cert-manager = module.cert-manager.id
    keycloak     = module.keycloak.id
    oidc         = module.oidc.id
  }
}

OIDC

There is an OAuth2-Proxy container deployed along with the Longhorn dashboard. Consequently, the oidc variable is expected to have at least the Issuer URL, the Client ID, and the Client Secret.

You can pass these values by pointing to an output from another module (as above), or by defining them explicitly:

module "longhorn" {
  ...
  oidc = {
    issuer_url    = "<URL>"
    client_id     = "<ID>"
    client_secret = "<SECRET>"
  }
  ...
}

Restoring volume backups

  1. If your pod and its volume are still up, start by shutting down the pod (be careful to also stop the Deployment/StatefulSet that manages it) and delete the volume using the Longhorn Dashboard.

  2. Go to the backup tab of the Longhorn Dashboard and restore the desired volume backup. You must check the Use Previous Name checkbox in order to keep the old volume name.

  3. Next, go to the volume tab, select your newly restored volume and choose the Create PV/PVC option. Select the Use Previous PVC option and validate.

  4. You can now bring your application back up, and it should attach the restored volume automatically.

Technical Reference

Dependencies

module.argocd_bootstrap.id

This module must be one of the first ones to be deployed, since other modules require Persistent Volumes. Consequently it needs to be deployed right after the module argocd_bootstrap. This is the only dependency that is not optional.

module.traefik.id and module.cert-manager.id

When enabling the ingress for the Longhorn Dashboard, you need to add Traefik and cert-manager as dependencies.

module.keycloak.id and module.oidc.id

When using Keycloak as an OIDC provider for the Longhorn Dashboard, you need to add Keycloak and the OIDC module as dependencies.

Requirements

The following requirements are needed by this module:

Providers

The following providers are used by this module:

Resources

The following resources are used by this module:

Optional Inputs

The following input variables are optional (have default values):

Description: Name given to the cluster. Value used for naming some of the resources created by the module.

Type: string

Default: "cluster"

Description: Base domain of the cluster. Value used for the ingress' URL of the application.

Type: string

Default: null

Description: Subdomain of the cluster. Value used for the ingress' URL of the application.

Type: string

Default: "apps"

Description: SSL certificate issuer to use. Usually you would configure this value as letsencrypt-staging or letsencrypt-prod on your root *.tf files.

Type: string

Default: "selfsigned-issuer"

Description: Name of the Argo CD AppProject where the Application should be created. If not set, the Application will be created in a new AppProject only for this Application.

Type: string

Default: null

Description: Labels to attach to the Argo CD Application resource.

Type: map(string)

Default: {}

Description: Destination cluster where the application should be deployed.

Type: string

Default: "in-cluster"

Description: Override of target revision of the application chart.

Type: string

Default: "v4.0.0"

Description: Helm chart value overrides. They should be passed as a list of HCL structures.

Type: any

Default: []

Description: Automated sync options for the Argo CD Application resource.

Type:

object({
    allow_empty = optional(bool)
    prune       = optional(bool)
    self_heal   = optional(bool)
  })

Default:

{
  "allow_empty": false,
  "prune": true,
  "self_heal": true
}

Description: IDs of the other modules on which this module depends.

Type: map(string)

Default: {}

Description: Set the storage over-provisioning percentage. This value should be modified only when really needed.

Type: number

Default: 100

Description: Set the minimal available storage percentage. This value should be modified only when really needed. The default is 25%, as recommended in the best practices for single-disk nodes.

Type: number

Default: 25

Description: Boolean to enable backups of Longhorn volumes to an external object storage.

Type: bool

Default: false

Description: Boolean to set the Storage Class with the backup configuration as the default for all Persistent Volumes.

Type: bool

Default: true

Description: Exoscale SOS bucket configuration where the backups will be stored. This configuration is required if the variable enable_pv_backups is set to true.

Type:

object({
    bucket_name = string
    region      = string
    endpoint    = string
    access_key  = string
    secret_key  = string
  })

Default: null

Description: The following values can be configured:

  1. snapshot_enabled - Enable Longhorn automatic snapshots.

  2. snapshot_cron - Cron schedule to configure Longhorn automatic snapshots.

  3. snapshot_retention - Retention of Longhorn automatic snapshots in days.

  4. backup_enabled - Enable Longhorn automatic backups to object storage.

  5. backup_cron - Cron schedule to configure Longhorn automatic backups.

  6. backup_retention - Retention of Longhorn automatic backups in days.

Warning: These settings cannot be changed after the StorageClass has been created without recreating it!

Type:

object({
    snapshot_enabled   = bool
    snapshot_cron      = string
    snapshot_retention = number
    backup_enabled     = bool
    backup_cron        = string
    backup_retention   = number
  })

Default:

{
  "backup_cron": "30 */12 * * *",
  "backup_enabled": false,
  "backup_retention": "2",
  "snapshot_cron": "0 */2 * * *",
  "snapshot_enabled": false,
  "snapshot_retention": "1"
}
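
As an illustration of the object described above, the following sketch enables both automatic snapshots and automatic backups. The local name `longhorn_backup_schedule` is hypothetical, and the cron expressions and retention values are arbitrary examples. Since none of the attributes are declared as optional in the type, all six need to be set.

```hcl
locals {
  # Hypothetical local; pass it to the module input documented above.
  longhorn_backup_schedule = {
    snapshot_enabled   = true
    snapshot_cron      = "0 */6 * * *" # every 6 hours, on the hour
    snapshot_retention = 2             # retention of snapshots
    backup_enabled     = true
    backup_cron        = "30 2 * * *" # every day at 02:30
    backup_retention   = 7            # retention of backups
  }
}
```

Keep in mind the warning above: changing these values after the StorageClass exists requires recreating it.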

Description: Boolean to enable the pre-upgrade check. Usually this value should be set to true and only set to false if you are bootstrapping a new cluster, otherwise the first deployment will not work.

Type: bool

Default: true

Description: Boolean to enable the deployment of a service monitor.

Type: bool

Default: false

Description: Additional labels to add to Longhorn alerts.

Type: map(string)

Default: {}

Description: Boolean to enable the creation of an ingress for the Longhorn dashboard. If enabled, you must provide a value for base_domain.

Type: bool

Default: false

Description: Boolean to enable the provisioning of a Longhorn dashboard for Grafana.

Type: bool

Default: true

Description: OIDC settings to configure OAuth2-Proxy which will be used to protect Longhorn’s dashboard.

Type:

object({
    issuer_url              = string
    oauth_url               = optional(string, "")
    token_url               = optional(string, "")
    api_url                 = optional(string, "")
    client_id               = string
    client_secret           = string
    oauth2_proxy_extra_args = optional(list(string), [])
  })

Default: null

Description: Settings to enable and configure automatic filesystem trim of volumes managed by Longhorn.

Type:

object({
    enabled   = bool
    cron      = string
    job_group = string
  })

Default:

{
  "cron": "0 6 * * *",
  "enabled": false,
  "job_group": ""
}

Description: Define a group list to add to recurring job selector for the default storage class (the custom backup one if set_default_storage_class is set or else the Longhorn default one).

Type:

list(object({
    name    = string
    isGroup = bool
  }))

Default: null
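
The two object shapes above (filesystem trim settings and recurring job selectors) can be sketched as follows. The local names and the group name `trim` are hypothetical. Reusing the same group in both places is an assumption about how you may want to tie the trim job to the default storage class, not something this reference states.

```hcl
locals {
  # Hypothetical local matching the filesystem trim object type.
  longhorn_filesystem_trim = {
    enabled   = true
    cron      = "0 3 * * 0" # every Sunday at 03:00
    job_group = "trim"      # hypothetical group name
  }

  # Hypothetical local matching the recurring job selector list type.
  longhorn_job_selectors = [
    {
      name    = "trim" # assumption: same group as job_group above
      isGroup = true
    },
  ]
}
```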

Description: Number of replicas created by Longhorn for each volume.

Type: number

Default: 2

Description: Tolerations to be added to the core Longhorn components that manage storage on nodes. These tolerations are required if you want Longhorn to schedule storage on nodes that are tainted.

These settings only have an effect on the first deployment. If added at a later time, you need to also add them on the Settings tab in the Longhorn Dashboard. Check the official documentation for more detailed information.

Only tolerations with the "Equal" operator are supported, because the Longhorn Helm chart expects a parsed list as a string in the defaultSettings.taintToleration value.

Type:

list(object({
    key      = string
    operator = string
    value    = string
    effect   = string
  }))

Default: []
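
A minimal sketch of a tolerations list that respects the "Equal" operator restriction described above. The local name and the taint key and value are hypothetical; they must match the taints actually set on your storage nodes.

```hcl
locals {
  # Hypothetical local matching the tolerations list type above.
  longhorn_tolerations = [
    {
      key      = "storage"    # hypothetical taint key
      operator = "Equal"      # the only operator supported by this module
      value    = "longhorn"   # hypothetical taint value
      effect   = "NoSchedule" # standard Kubernetes taint effect
    },
  ]
}
```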

Outputs

The following outputs are exported:

id

Description: ID to pass to other modules in order to refer to this module as a dependency.
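
For instance, a module that requires Persistent Volumes could wait for Longhorn by referencing this output. In the sketch below, the downstream module name and its source path are hypothetical, and it is assumed to accept a `dependency_ids` map like this module does.

```hcl
module "example_app" {
  # Hypothetical downstream module that needs Persistent Volumes.
  source = "./modules/example-app"

  dependency_ids = {
    # Ensures Longhorn is deployed before this module.
    longhorn = module.longhorn.id
  }
}
```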

Reference in table format

= Requirements

Name | Version
     | >= 6
     | >= 3
     | >= 1

= Providers

Name | Version
     | >= 3
     | >= 6
     | >= 1
     | n/a

= Resources

Name | Type
     | resource
     | resource
     | resource
     | resource
     | resource
     | data source

= Outputs

Name | Description
id | ID to pass to other modules in order to refer to this module as a dependency.