[cna] CNA support for external-resources #3056

geoberle · 2022-12-09T12:16:14Z

the new cna integration introduces the external-resources extension for CNA

It covers the following modules

null - for simple testing purposes without any prereqs on AWS accounts or clusters etc.
aws-assume-role - for testing AWS assume-role access by CNA
aws-rds - create an AWS RDS database

General approach

This integration is, as many others, a translation engine from one configuration language (app-interface external resource) to another one (CNA API + modules).

A cloud native asset in CNA is represented by a module, and a module has various parameters that need to be provided when creating or updating an asset. A module is e.g. an RDS database. Additionally, each asset has a name (technically the name is optional, but this integration uses the name to know which asset in CNA belongs to which external resource in app-interface).

A CNA module along with its parameters is represented in qontract-schema within the /cna/asset-1.yml schema and the CNAsset_v1 GQL type. The parameters are not necessarily mapped 1-1. Not all parameters in CNA need to be specified in app-interface, because they might be provided by context (e.g. the region for an RDS can be inferred from the VPC) or might be provided as defaults within the code. Additionally, each asset in qontract-schema has a dedicated datafile schema for defaults and provides override options for the defaults. As such, the CNA external-resource schema aligns with what terraform-resources and cloudflare-resources already do (except for the fact that currently we mostly use resource files without schemas for defaults).

From a high level perspective, the CNA integration just translates a-i config objects to CNA asset API calls. Looking deeper it also handles the classical desired state (app-interface config) to current state (CNA world state) comparison and reconciliation.
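The desired-versus-current comparison can be pictured roughly as follows. This is a minimal sketch, not the integration's code: the `Asset` dataclass, the `plan` function and the action names are hypothetical, and assets are matched by name as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """Hypothetical, simplified stand-in for a CNA asset."""

    name: str
    asset_type: str
    parameters: tuple[tuple[str, object], ...] = ()


def plan(desired: list[Asset], current: list[Asset]) -> dict[str, list[str]]:
    """Compare desired state with current state, matching assets by name."""
    desired_by_name = {a.name: a for a in desired}
    current_by_name = {a.name: a for a in current}
    return {
        # In app-interface but not yet known to CNA
        "create": sorted(desired_by_name.keys() - current_by_name.keys()),
        # Known to CNA but no longer declared in app-interface
        "delete": sorted(current_by_name.keys() - desired_by_name.keys()),
        # Present on both sides but with differing content
        "update": sorted(
            name
            for name in desired_by_name.keys() & current_by_name.keys()
            if desired_by_name[name] != current_by_name[name]
        ),
    }
```

Each resulting action would then be turned into the corresponding CNA asset API call.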

Unlike the other external-resource provisioners in a-i, the CNA provider does not come with a dry-run option because the CNA API does not support a dry-run. Therefore a lot of effort went into validating the data passed to CNA as well as possible by doing the following things:

have thorough schema, even for defaults
have the schema represented as dataclasses that are fully type hinted
have CNA modules as dataclasses that are fully type hinted
module based tests for the object translation process
validate configuration for consistency in code where not possible in the schema

This is for sure not sufficient to ensure every asset can be created without issues by CNA but it is as good as we can check without a dry-run and without inspecting AWS account prereqs etc. upfront.
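As an illustration of the last item in the list, a consistency check that is hard to express in a schema might look like the sketch below. The `RDSConfig` dataclass and the specific rules are assumptions made for this example; the integration's actual classes and checks may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RDSConfig:
    """Hypothetical typed view of a few aws-rds parameters."""

    engine: str
    allocated_storage: int
    max_allocated_storage: Optional[int] = None

    def validate(self) -> list[str]:
        """Return a list of consistency problems; an empty list means none found."""
        problems: list[str] = []
        if self.allocated_storage <= 0:
            problems.append("allocated_storage must be positive")
        if (
            self.max_allocated_storage is not None
            and self.max_allocated_storage < self.allocated_storage
        ):
            problems.append("max_allocated_storage must not be below allocated_storage")
        return problems
```

Collecting all problems instead of raising on the first one lets a single run report every inconsistency, which matters more when there is no dry-run to fall back on.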

How to configure app-interface for CNA

Provisioner

cna-experimental serves as the provider name, indicating the experimental nature of the integration, and CNAExperimentalProvisioner_v1 serves as the provisioner, wrapping an OCM organization and potentially adding additional config data (we will add an API URL override to this soon so we can target CNA API instances living outside of api.openshift.com).

$schema: /cna/experimental-provisioner-1.yml

name: app-sre
description: CNA provisioner for the app-sre OCM organization

ocm:
  $ref: /dependencies/ocm/stage.yml

# potential additional config options go here

Example RDS

The defaults files for CNA are now datafile schemas as well. As such they enable us to use references, e.g. a VPC

Here is the example of a defaults file for a production DB

$schema: /cna/aws-rds-config-1.yml
engine: postgres
username: postgres
engine_version: '14.2'
instance_class: db.m6.large
backup_retention_period: 7
db_subnet_group_name: default
allocated_storage: 200
max_allocated_storage: 1000
multi_az: true
vpc:
  $ref: /aws/some-account/vpcs/default-vpc.yml

and the external resource declaration that leverages it

$schema: /openshift/namespace-1.yml
...
externalResources:
- provider: cna-experimental 
  provisioner:
    $ref: /cna/app-sre.yml
  resources:
  - provider: aws-rds
    identifier: my-db
    defaults:
      $ref: the-defaults-file.yml
    overrides:
      backup_retention_period: 14
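
How defaults and overrides combine for this resource can be pictured with a small sketch. It assumes a plain shallow merge in which override keys win; the function name `effective_config` is made up for illustration, and the integration may resolve values differently.

```python
from typing import Any, Mapping, Optional


def effective_config(
    defaults: Mapping[str, Any], overrides: Optional[Mapping[str, Any]] = None
) -> dict[str, Any]:
    """Shallow merge: start from the defaults, then let overrides win."""
    merged = dict(defaults)
    merged.update(overrides or {})
    return merged


# A few values taken from the example defaults file and overrides section above.
defaults = {"engine": "postgres", "backup_retention_period": 7, "multi_az": True}
config = effective_config(defaults, {"backup_retention_period": 14})
```

Under that assumption `config` keeps `engine` and `multi_az` from the defaults file, while `backup_retention_period` comes from the namespace file.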

design doc: https://gitlab.cee.redhat.com/service/app-interface/-/merge_requests/53097
ref: https://issues.redhat.com/browse/APPSRE-6295
schema: app-sre/qontract-schemas#355

steveteahan

Just a couple high-level comments to start. I'll pick back up with reviewing the rest tomorrow.

reconcile/cna/assets/aws_rds.py

steveteahan · 2022-12-13T17:14:45Z

reconcile/cna/assets/aws_rds.py

+    deletion_protection: Optional[bool] = Field(None, alias="deletion_protection")
+    apply_immediately: Optional[bool] = Field(None, alias="apply_immediately")
+
+    # Those values are implicit and not set in app-interface


What does this mean?

I see later we set:

is_production=True,

This is my fault for not yet getting to improving the documentation of the module with a proper README, but I did try my best to document the intent here:

Indicates whether the resource is a production resource. Setting this value will result in many defaults being set for you without needing to specify them.
https://gitlab.cee.redhat.com/service/cna-modules/-/blob/main/modules/aws-rds/variables.tf#L25

In theory, you could have a defaults file that sets is_production only and avoid needing to have sane defaults for instance class, multi-az, storage, etc. Eventually some of those values will be overridden by teams, but the simplest use case would ignore them all initially.

i agree that is_production is a good thing for direct users of CNA. when we talk about app-interface, where we also have defaults and overrides, it can get messy to know where the actual settings for an RDS instance are coming from.

i would tend to ignore the is_production settings, providing an arbitrary value for it (in this case true), while at the same time making all parameters in the defaults schema mandatory so looking at app-interface is self-sufficient to get the full picture about configuration settings. we can provide meaningful presets as referenceable defaults datafiles.

wdyt?

I think that the usage of defaults in app-interface isn't great today. Teams tend to copy them without much thought. The values are only as good as the person who created the file. The defaults provided by is_production are better in my (slightly biased) opinion because they are opinionated, and allow for the ability to quickly determine where teams have deviated from these default values.

I look at it like:

is_production provides your sane defaults for most use cases

defaults allows teams to avoid redundant configurations across services

The issue of where values come from is going to be present whether we hide is_production or not.

The module documentation might help clarify things a bit as well.

steveteahan

Overall, it looks great. Thank you for the effort on this!

steveteahan · 2022-12-19T15:08:58Z

reconcile/cna/client.py

+    def _init_metadata(self) -> dict[AssetType, AssetTypeMetadata]:
+        asset_types_metadata: dict[AssetType, AssetTypeMetadata] = {}
+        for asset_type_ref in self._ocm_client.get(
+            api_path="/api/cna-management/v1/asset_types"


nit: could we dedupe a lot of these URLs with something like {CNA_API_V1}/asset_types?

good point. fixed in a577a1c
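
A sketch of what such a deduplication could look like. The constant name `CNA_API_V1` follows the suggestion above and the path prefix is the one visible in the diff; the helper `cna_url` is hypothetical, and the actual change in a577a1c may look different.

```python
# Shared prefix, so each endpoint path is spelled out only once.
CNA_API_V1 = "/api/cna-management/v1"


def cna_url(resource: str) -> str:
    """Build an API path below the shared CNA prefix."""
    return f"{CNA_API_V1}/{resource.lstrip('/')}"
```

A call such as `cna_url("asset_types")` then replaces the literal path in `_init_metadata`.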

reconcile/cna/client.py

reconcile/cna/integration.py

steveteahan · 2022-12-19T15:32:19Z

reconcile/test/cna/test_client.py

+
+
+def test_client_asset_type_metadata_init():
+    pass


Placeholder?

@fishi0x01 and i were discussing if we should have some hard validation of the metadata provided by the CNA module metadata endpoint with what we know about the modules in code. this is the placeholder test so we don't forget :)

Worth a docstring or using xfail to track?

steveteahan · 2022-12-19T15:34:46Z

reconcile/test/cna/test_client.py

+    ]
+
+    mocker.patch.object(CNAClient, "list_assets", return_value=listed_assets)
+    cna_client = CNAClient(None)  # type: ignore


nit: would create_autospec(OCMBaseClient) work here?

i autospeced the patch only

mocker.patch.object(CNAClient, "list_assets", return_value=listed_assets, autospec=True)

i verified that this is sufficient to ensure that list_assets_for_creator calls list_assets correctly.

is this ok with you or do you prefer create_autospec?

I was thinking more of the fact that we're passing None in and need to ignore a type error? If we used create_autospec as a Mock factory, then I don't think you'd need to do this?
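
A minimal sketch of that idea, using only `unittest.mock`. `OCMBaseClient` and `CNAClient` below are simplified stand-ins written for this example, not the real classes.

```python
from unittest.mock import create_autospec


class OCMBaseClient:
    """Stand-in for the real OCM client (assumed interface)."""

    def get(self, api_path: str) -> dict:
        raise NotImplementedError


class CNAClient:
    """Stand-in for the client under test."""

    def __init__(self, ocm_client: OCMBaseClient) -> None:
        self._ocm_client = ocm_client

    def asset_types(self) -> dict:
        return self._ocm_client.get(api_path="/api/cna-management/v1/asset_types")


# A spec'd mock instance replaces `None`, so no `# type: ignore` is needed.
ocm_mock = create_autospec(OCMBaseClient, instance=True)
ocm_mock.get.return_value = {"items": []}
cna_client = CNAClient(ocm_mock)
```

Because the mock carries the signature of `OCMBaseClient.get`, a call with the wrong arguments raises a `TypeError` instead of passing silently.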

steveteahan · 2022-12-19T15:35:10Z

reconcile/test/cna/test_client.py

+        },
+    ]
+
+    mocker.patch.object(CNAClient, "list_assets", return_value=listed_assets)


nit: reminder on autospec usage where it makes sense (there could be a good reason not to here).

maybe covered in the other autospec related comment?

…2932)

* rely on asset field metadata for CNA API and asset class conversion
* use registration process to make asset classes, their provider and CNA kind known to the integration
* use pydantic `alias` fields to map the difference from CNA api and python dataclasses
* drive dataclass<->API conversion through this metadata
* load CNA metadata into the client so we can test for dataclass<->API compatibility
* renamed kind to asset_type (kind means something different in the CNA API)
* creator filtering

all CNA assets must have a `defaults` and an `overrides` section, as a lot of the terraform-resources do. the difference is that each CNA type declares a special `XXXConfig_v1` type that is used for both fields.

```yaml
name: CNARDSInstance_v1
interface: CNAsset_v1
fields:
- { name: provider, type: string, isRequired: true }
- { name: identifier, type: string, isRequired: true, isUnique: true }
- { name: name, type: string }
- { name: defaults, type: CNARDSInstanceConfig_v1 }   # <--
- { name: overrides, type: CNARDSInstanceConfig_v1 }  # <--
```

having defaults following a strict schema and overrides being able to override all of the defaults makes writing testable and verifiable code a lot easier.

revert experiment where overrides and defaults are exactly the same schema. the result was that all fields in overrides and defaults needed to be optional so they can be used in both places without making overrides mandatory. in this PR, overrides are now all optional, follow a schema in jsonschema but none in GQL where they are just JSON

steveteahan · 2022-12-20T15:52:40Z

reconcile/test/cna/test_integration.py

-    cna_clients["test"].list_assets.side_effect = [listed_assets]  # type: ignore
-    integration = CNAIntegration(cna_clients=cna_clients, namespaces=[])
+    mocker.patch.object(
+        CNAClient, "list_assets", create_autospec=True, return_value=listed_assets


create_autospec is a function, whereas autospec is the option that enables autospeccing for the object created by the patch. I'm actually a bit surprised that no errors are thrown by this, unless I'm missing something?

I confirmed with patch that create_autospec doesn't do what we expected, while autospec does:

>>> from unittest.mock import patch
>>> def test(a):
...     print(a)
...
# In this test too many arguments are passed, which should result in an error, but it works just fine
>>> with patch('__main__.test', return_value="test", create_autospec=True) as patch_test:
...     print(test('too', 'many', 'args'))
...
test
# With autospec=True we get the expected TypeError
>>> with patch('__main__.test', return_value="test", autospec=True) as patch_test:
...     print(test('too', 'many', 'args'))
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "<string>", line 2, in test
  File "/usr/lib64/python3.9/unittest/mock.py", line 180, in checksig
    sig.bind(*args, **kwargs)
  File "/usr/lib64/python3.9/inspect.py", line 3043, in bind
    return self._bind(args, kwargs)
  File "/usr/lib64/python3.9/inspect.py", line 2964, in _bind
    raise TypeError('too many positional arguments') from None
TypeError: too many positional arguments
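
The reason no error is raised is that `patch` forwards keyword arguments it does not recognise to the mock it creates, where they simply become attributes. The sketch below shows this with a stand-in `CNAClient` class (hypothetical, not the real one) and a small helper, `call_with_extra_arg`, invented for the comparison.

```python
from unittest.mock import patch


class CNAClient:
    """Stand-in class for illustration only."""

    def list_assets(self) -> list:
        return []


def call_with_extra_arg(**patch_kwargs) -> str:
    """Patch list_assets, call it with one argument too many, report what happened."""
    with patch.object(CNAClient, "list_assets", return_value=[], **patch_kwargs):
        try:
            CNAClient().list_assets("unexpected")
        except TypeError:
            return "TypeError"
        return "accepted"


with patch.object(CNAClient, "list_assets", create_autospec=True) as mocked:
    # The unknown keyword ended up as a plain attribute on the MagicMock.
    stray_attribute = mocked.create_autospec
```

With `create_autospec=True` the helper reports that the bad call was accepted, while `autospec=True` makes the same call fail the signature check.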

geoberle force-pushed the cna-integration branch 2 times, most recently from 94eb215 to 3556ae5 Compare December 13, 2022 14:34

steveteahan reviewed Dec 13, 2022

steveteahan reviewed Dec 19, 2022

geoberle and others added 18 commits December 20, 2022 08:55

[CNA] assume role test resource and role ARN retrieval from AWS accou…

2abd93c

…nt (#2927) this implements the example-aws-assumerole CNA type to test assume role functionality. the role ARNs are stored in the aws account Signed-off-by: Gerd Oberlechner <[email protected]>

[CNA] add support for aws-rds CNA module (#2936)

ed55a36

[CNA] add support for aws-rds CNA module

runtime_checkable for Namespace (#2960)

6dc3188

CNA bindings (#2962)

956b1c1

move CNA binding fetching into the asset creation loop (#2969)

726694e

a minor change: this way we can use the same `client` to fetch the `bindings` that we used to fetch the actual assets. Signed-off-by: Gerd Oberlechner <[email protected]>

set secret_name (#2976)

8b72f23

allow bool as asset type variable (#3005)

fabf67e

allow Error state (#3006)

62a1647

fix import issues due to rebase

4633d93

Signed-off-by: Gerd Oberlechner <[email protected]>

[cna] revert externalresourcespec (#3018)

b863ced

the changes have been extracted from the cna-integration branch into this PR #2990

make RDS work (#3022)

4876411

reformat and lint after rebase

935f0ea

Signed-off-by: Gerd Oberlechner <[email protected]>

running qenerate one more time

e70a1dd

Signed-off-by: Gerd Oberlechner <[email protected]>

dedup CNA API URL prefix

a577a1c

Signed-off-by: Gerd Oberlechner <[email protected]>

clarify failure behavior on CNA API errors

443690b

Signed-off-by: Gerd Oberlechner <[email protected]>

geoberle force-pushed the cna-integration branch from 3556ae5 to 443690b Compare December 20, 2022 07:56

geoberle added 2 commits December 20, 2022 09:08

removed log statement from bind dry-run

408a355

Signed-off-by: Gerd Oberlechner <[email protected]>

autospec the patched list_asset function

727282a

Signed-off-by: Gerd Oberlechner <[email protected]>

steveteahan reviewed Dec 20, 2022
