Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cna] CNA support for external-resources #3056

Open
wants to merge 20 commits into
base: master
Choose a base branch
from
Open

Conversation

geoberle
Copy link
Contributor

@geoberle geoberle commented Dec 9, 2022

the new cna integration introduces the external-resources extension for CNA

It covers the following modules

  • null - for simple testing purposes without any prereqs on AWS accounts or clusters etc.
  • aws-assume-role - for testing AWS assume-role access by CNA
  • aws-rds - create an AWS RDS database

General approach

This integration is, as many others, a translation engine from one configuration language (app-interface external resource) to another one (CNA API + modules).

A cloud native asset in CNA is represented by a module and a module has various parameters that need to be provided when creating/updating an asset. A module is e.g. an RDS database. Additionally each asset has a name (technically the name is optional but this integration uses the name to know which asset in CNA belongs to witch external resource in app-interface).

A CNA module along with their parameters are represented in qontract-schema within the /cna/asset-1.yml schema and the CNAsset_v1 GQL type. The parameters are not necessarily mapped 1-1. Not all parameters in CNA might be relevant to be specified in app-interface because they might be provided by context (e.g. the region for a RDS can be inferred by the VPC) or might be provided as defaults within the code. Additionally each asset in qontract-schema has a dedicated datafile schema for defaults and provides override options for the defaults. As such the CNA external-resource schema aligns with what terraform-resources and cloudflare-resources already do (except for the fact that currently we mostly use resource files without schemas for defaults).

From a high level perspective, the CNA integration just translates a-i config objects to CNA asset API calls. Looking deeper it also handles the classical desired state (app-interface config) to current state (CNA world state) comparison and reconciliation.

Unlike the external-resource provisioners in a-i, the CNA provider does not come with a dry-run option because the CNA API does not support a dry-run. Therefore a lot of effort went into validating the data passed to CNA as good as possible by doing the following things:

  • have thorough schema, even for defaults
  • have the schema represented as dataclasses that are fully type hinted
  • have CNA modules as dataclasses that are fully type hinted
  • module based tests for the object translation process
  • validate configuration for consistency in code where not possible in the schema

This is for sure not sufficient to ensure every asset can be created without issues by CNA but it is as good as we can check without a dry-run and without inspecting AWS account prereqs etc. upfront.

How to configure app-interface for CNA

Provisioner

cna-experimental serves as the provisioner, indicating the experimental nature of the integration and the CNAExperimentalProvisioner_v1 serves as the provisioner, wrapping an OCM organization and potentially adding additional config data (we will add a API URL override to this soon so we can target CNA API instances living outside of api.openshift.com)

$schema: /cna/experimental-provisioner-1.yml

name: app-sre
description: CNA provisioner for the app-sre OCM organization

ocm:
  $ref: /dependencies/ocm/stage.yml

<<<--- potential additional config options go here

Example RDS

The defaults files for CNA are now datafile schemas as well. As such they enable us to use references, e.g. a VPC

Here is the example of a defaults file for a production DB

$schema: /cna/aws-rds-config-1.yml
engine: postgres
username: postgres
engine_version: '14.2'
instance_class: db.m6.large
backup_retention_period: 7
db_subnet_group_name: default
allocated_storage: 200
max_allocated_storage: 1000
multi_az: true
vpc:
  $ref: /aws/some-account/vpcs/default-vpc.yml

and the external resource declaration that leverages it

$schema: /openshift/namespace-1.yml
...
externalResources:
- provider: cna-experimental 
  provisioner:
    $ref: /cna/app-sre.yml
  resources:
  - provider: aws-rds
    identifier: my-db
    defaults:
      $ref: the-defaults-file.yml
    overrides:
      backup_retention_period: 14

design doc: https://gitlab.cee.redhat.com/service/app-interface/-/merge_requests/53097
ref: https://issues.redhat.com/browse/APPSRE-6295
schema: app-sre/qontract-schemas#355

@geoberle geoberle force-pushed the cna-integration branch 2 times, most recently from 94eb215 to 3556ae5 Compare December 13, 2022 14:34
Copy link
Contributor

@steveteahan steveteahan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple high-level comments to start. I'll pick back up with reviewing the rest tomorrow.

reconcile/cna/assets/aws_rds.py Show resolved Hide resolved
deletion_protection: Optional[bool] = Field(None, alias="deletion_protection")
apply_immediately: Optional[bool] = Field(None, alias="apply_immediately")

# Those values are implicit and not set in app-interface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

I see later we set:

is_production=True,

This is my fault for not yet getting to improving the documentation of the module with a proper README, but I did try my best to document the intent here:

Indicates whether the resource is a production resource. Setting this value will result in many defaults being set for you without needing to specify them.
https://gitlab.cee.redhat.com/service/cna-modules/-/blob/main/modules/aws-rds/variables.tf#L25

In theory, you could have a defaults file that sets is_production only and avoid needing to have sane deaults for instance class, multi-az, storage, etc. Eventually some of those values will be overridden by teams, but the simplest use case would ignore them all initially.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree that is_production is a good thing for direct users of CNA. when we talk about app-interface, where we also have defaults and overrides, it can get messy to know where the actual settings for an RDS instance are coming from.

i would tend to ignore the is_production settings, providing an arbitrary value for it (in this case true), while at the same time making all parameters in the defaults schema mandatory so looking at app-interface is self-sufficient to get the full picture about configuration settings. we can provide meaningful presets as referegcable defaults datafiles.

wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the usage of defaults in app-interface isn't great today. Teams tend to copy them without much thought. The values are only as good as the person who created the file. The defaults provided by is_production are better in my (slightly biased) opinion because they are opinionated, and allow for the ability to quickly determine where teams have deviated from these default values.

I look at it like:

  • is_production provides your sane defaults for most use cases
  • defaults allows teams to avoid redundant configurations across services

The issue of where values come from is going to be present whether we hide is_production or not.

The module documentation might help clarify things a bit as well.

Copy link
Contributor

@steveteahan steveteahan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it looks great. Thank you for the effort on this!

def _init_metadata(self) -> dict[AssetType, AssetTypeMetadata]:
asset_types_metadata: dict[AssetType, AssetTypeMetadata] = {}
for asset_type_ref in self._ocm_client.get(
api_path="/api/cna-management/v1/asset_types"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we dedupe a lot of these URLs with something like {CNA_API_V1}/asset_types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. fixed in a577a1c

reconcile/cna/client.py Show resolved Hide resolved
reconcile/cna/client.py Show resolved Hide resolved
reconcile/cna/integration.py Show resolved Hide resolved


def test_client_asset_type_metadata_init():
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Placeholder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fishi0x01 and i were discussing if we should have some hard validation of the metadata provided by the CNA module metadata endpoint with what we know about the modules in code. this is the placeholder test so we don't forget :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a docstring or using xfail to track?

]

mocker.patch.object(CNAClient, "list_assets", return_value=listed_assets)
cna_client = CNAClient(None) # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would create_autospec(OCMBaseClient) work here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i autospeced the patch only

mocker.patch.object(CNAClient, "list_assets", return_value=listed_assets, autospec=True)

i verified that this is sufficient to ensure that list_assets_for_creator calls listed_assets correctly.

is this ok with you or do you prefer create_autospec?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more of the fact that we're passing None in and need to ignore a type error? If we used create_autospec as a Mock factory, then I don't think you'd need to do this?

},
]

mocker.patch.object(CNAClient, "list_assets", return_value=listed_assets)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: reminder on autospec usage where it makes sense (there could be a good reason not to here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe covered in the other autospec related comment?

geoberle and others added 18 commits December 20, 2022 08:55
…nt (#2927)

this implements the example-aws-assumerole CNA type to test assume role functionality.

the role ARNs are stored in the aws account

Signed-off-by: Gerd Oberlechner <[email protected]>
…2932)

* rely on asset field metadata for CNA API and asset class conversion
* use registration process to make asset classes, their provider and CNA kind known to the integration
* use pydantic `alias` fields to map the difference from CNA api and python dataclasses
* drive dataclass<->API conversion through this metadata
* load CNA metadata into the client so we can test for dataclass<->API compatibility
* renamed kind to asset_type (kind means something different in the CNA API)
* creator filtering
[CNA] add support for aws-rds CNA module
a minor change:
this way the we can use the same `client` to fetch the `bindings` we used to fetch the actual assets.

Signed-off-by: Gerd Oberlechner <[email protected]>
all CNA assets must have a `defaults` and an `overrides` section, as a lot of the terraform-resources do. the difference is, that each CNA type declares a special `XXXConfig_v1` type that is used for both fields.

```yaml
  name: CNARDSInstance_v1
  interface: CNAsset_v1
  fields:
  - { name: provider, type: string, isRequired: true }
  - { name: identifier, type: string, isRequired: true, isUnique: true }
  - { name: name, type: string }
  - { name: defaults, type: CNARDSInstanceConfig_v1 } <--
  - { name: overrides, type: CNARDSInstanceConfig_v1 } <--
```

having defaults following a strict schema and overrides being able to override all of the defaults, makes writing testable and verifiable code a lot easier.
revert experiment where overrides and defaults are exactly the same schema. the result was that all fields in overrides and defaults needed to be optional so they can be used in both places without making overrides mandatory.

in this PR, overrides are now all optional, follow a schema in jsonschema but none in GQL where they are just JSON
Signed-off-by: Gerd Oberlechner <[email protected]>
the changes have been extracted from the cna-integration branch into this PR

#2990
Signed-off-by: Gerd Oberlechner <[email protected]>
Signed-off-by: Gerd Oberlechner <[email protected]>
Signed-off-by: Gerd Oberlechner <[email protected]>
cna_clients["test"].list_assets.side_effect = [listed_assets] # type: ignore
integration = CNAIntegration(cna_clients=cna_clients, namespaces=[])
mocker.patch.object(
CNAClient, "list_assets", create_autospec=True, return_value=listed_assets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_autospec is a method whereas autospec should be the option to enabling autospeccing for the object created by the patch. I'm actually a bit surprised that no errors are thrown by this unless I'm missing something?

I confirmed with patch that create_autospec doesn't do what we expected, while autospec does:

>>> from unittest.mock import patch
>>> def test(a):
...     print(a)
... 

# In this test too many arguments are passed, which should result in an error, but it works just fine
>>> with patch('__main__.test', return_value="test", create_autospec=True) as patch_test:
...     print(test('too', 'many', 'args'))
... 
test

# With autospec=True we get the expected TypeError
>>> with patch('__main__.test', return_value="test", autospec=True) as patch_test:
...     print(test('too', 'many', 'args'))
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "<string>", line 2, in test
  File "/usr/lib64/python3.9/unittest/mock.py", line 180, in checksig
    sig.bind(*args, **kwargs)
  File "/usr/lib64/python3.9/inspect.py", line 3043, in bind
    return self._bind(args, kwargs)
  File "/usr/lib64/python3.9/inspect.py", line 2964, in _bind
    raise TypeError('too many positional arguments') from None
TypeError: too many positional arguments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants