dbt-materialize: v1.8.0 migration (#27011)

### Motivation

Fixes #26226.

Initial PR to update the required dependencies for the dbt v1.8.0
migration.

---------

Co-authored-by: morsapaes <[email protected]>
bobbyiliev and morsapaes authored May 23, 2024
1 parent 1104524 commit d285c1f
Showing 21 changed files with 413 additions and 44 deletions.
16 changes: 6 additions & 10 deletions doc/user/content/manage/dbt/_index.md
@@ -571,22 +571,18 @@ and correct results** with millisecond latency whenever you query your views.
[//]: # "TODO(morsapaes) Call out the cluster configuration for tests and
store_failures_as once this page is rehashed."
[//]: # "TODO(morsapaes) Add instructions for unit testing after the upcoming
v1.8 release of dbt Core."
### Configure continuous testing
Using dbt in a streaming context means that you're able to run data quality and
integrity [tests](https://docs.getdbt.com/docs/building-a-dbt-project/tests)
non-stop, and monitor failures as soon as they happen. This is useful for unit
testing during the development of your dbt models, and later in production to
non-stop. This is useful to monitor failures as soon as they happen, and
trigger **real-time alerts** downstream.
1. To configure your project for continuous testing, add a `tests` property to
1. To configure your project for continuous testing, add a `data_tests` property to
`dbt_project.yml` with the `store_failures` configuration:
```yaml
tests:
data_tests:
dbt_project.name:
models:
+store_failures: true
@@ -601,7 +597,7 @@ trigger **real-time alerts** downstream.
**Note:** As an alternative, you can specify the `--store-failures` flag
when running `dbt test`.
1. Add tests to your models using the `tests` property in the model
1. Add tests to your models using the `data_tests` property in the model
configuration `.yml` files:
```yaml
@@ -611,7 +607,7 @@ trigger **real-time alerts** downstream.
columns:
- name: col_a
description: 'column a description'
tests:
data_tests:
- not_null
- unique
```
@@ -623,7 +619,7 @@ trigger **real-time alerts** downstream.
1. Run the tests:
```bash
dbt test
dbt test # use --select test_type:data to only run data tests!
```
When configured to `store_failures`, this command will create a materialized
143 changes: 141 additions & 2 deletions doc/user/content/manage/dbt/development-workflows.md
@@ -19,7 +19,7 @@ using [dbt](/manage/dbt/) as your deployment tool.

When you're prototyping your use case and fine-tuning the underlying data model,
your priority is **iteration speed**. dbt has many features that can help speed
up development, like [node selection](#node-selection) and [model preview](#preview-model-results).
up development, like [node selection](#node-selection) and [model preview](#model-results-preview).
Before you start, we recommend getting familiar with how these features
work with the `dbt-materialize` adapter to make the most of your development
time.
@@ -97,7 +97,7 @@ dbt run --select "path/to/my_model.sql" # runs a specific model by its path

For a full rundown of selection logic options, check the [dbt documentation](https://docs.getdbt.com/reference/node-selection/syntax).

### Preview model results
### Model results preview

{{< note >}}
The `dbt show` command uses a `LIMIT` clause under the hood, which has
@@ -134,6 +134,145 @@ It's important to note that previewing results compiles the model and runs the
compiled SQL against Materialize; it doesn't query the already-materialized
database relation (see [`dbt-core` #7391](https://github.com/dbt-labs/dbt-core/issues/7391)).

### Unit tests

**Minimum requirements:** `dbt-materialize` v1.8.0+

{{< note >}}
Complex types like [`map`](/sql/types/map/) and [`list`](/sql/types/list/) are
not supported in unit tests yet (see [`dbt-adapters` #113](https://github.com/dbt-labs/dbt-adapters/issues/113)).
For an overview of other known limitations, check the [dbt documentation](https://docs.getdbt.com/docs/build/unit-tests#before-you-begin).
{{</ note >}}

To validate your SQL logic without fully materializing a model, as well as
future-proof it against edge cases, you can use [unit tests](https://docs.getdbt.com/docs/build/unit-tests).
Unit tests can be a **quicker way to iterate on model development** in
comparison to re-running the models, since you don't need to wait for a model
to hydrate before you can validate that it produces the expected results.

1. As an example, imagine your dbt project includes the following models:

**Filename:** _models/my_model_a.sql_
```sql
SELECT
1 AS a,
1 AS id,
2 AS not_testing,
'a' AS string_a,
DATE '2020-01-02' AS date_a
```

**Filename:** _models/my_model_b.sql_
```sql
SELECT
2 as b,
1 as id,
2 as c,
'b' as string_b
```

**Filename:** _models/my_model.sql_
```sql
SELECT
a+b AS c,
CONCAT(string_a, string_b) AS string_c,
not_testing,
date_a
FROM {{ ref('my_model_a')}} my_model_a
JOIN {{ ref('my_model_b' )}} my_model_b
ON my_model_a.id = my_model_b.id
```

1. To add a unit test to `my_model`, create a `.yml` file under the `/models`
directory, and use the [`unit_tests`](https://docs.getdbt.com/reference/resource-properties/unit-tests)
property:

**Filename:** _models/unit_tests.yml_
```yaml
unit_tests:
- name: test_my_model
model: my_model
given:
- input: ref('my_model_a')
rows:
- {id: 1, a: 1}
- input: ref('my_model_b')
rows:
- {id: 1, b: 2}
- {id: 2, b: 2}
expect:
rows:
- {c: 2}
```

For simplicity, this example provides mock data using inline dictionary
values, but other formats are supported. Check the [dbt documentation](https://docs.getdbt.com/reference/resource-properties/data-formats)
for a full rundown of the available options.

1. Run the unit tests using `dbt test`:

```bash
dbt test --select test_type:unit
12:30:14 Running with dbt=1.8.0
12:30:14 Registered adapter: materialize=1.8.0
12:30:14 Found 6 models, 1 test, 4 seeds, 1 source, 471 macros, 1 unit test
12:30:14
12:30:16 Concurrency: 1 threads (target='dev')
12:30:16
12:30:16 1 of 1 START unit_test my_model::test_my_model ................................. [RUN]
12:30:17 1 of 1 FAIL 1 my_model::test_my_model .......................................... [FAIL 1 in 1.51s]
12:30:17
12:30:17 Finished running 1 unit test in 0 hours 0 minutes and 2.77 seconds (2.77s).
12:30:17
12:30:17 Completed with 1 error and 0 warnings:
12:30:17
12:30:17 Failure in unit_test test_my_model (models/models/unit_tests.yml)
12:30:17
actual differs from expected:
@@ ,c
+++,3
---,2
```

It's important to note that the **direct upstream dependencies** of the
model that you're unit testing **must exist** in Materialize before you can
execute the unit test via `dbt test`. To ensure these dependencies exist,
you can use the `--empty` flag to build an empty version of the models:

```bash
dbt run --select "my_model_a.sql" "my_model_b.sql" --empty
```

Alternatively, you can execute unit tests as part of the `dbt build`
command, which will ensure the upstream dependencies are created before
any unit tests are executed:

```bash
dbt build --select "+my_model.sql"
11:53:30 Running with dbt=1.8.0
11:53:30 Registered adapter: materialize=1.8.0
...
11:53:33 2 of 12 START sql view model public.my_model_a ................................. [RUN]
11:53:34 2 of 12 OK created sql view model public.my_model_a ............................ [CREATE VIEW in 0.49s]
11:53:34 3 of 12 START sql view model public.my_model_b ................................. [RUN]
11:53:34 3 of 12 OK created sql view model public.my_model_b ............................ [CREATE VIEW in 0.45s]
...
11:53:35 11 of 12 START unit_test my_model::test_my_model ............................... [RUN]
11:53:36 11 of 12 FAIL 1 my_model::test_my_model ........................................ [FAIL 1 in 0.84s]
11:53:36 Failure in unit_test test_my_model (models/models/unit_tests.yml)
11:53:36
actual differs from expected:
@@ ,c
+++,3
---,2
```
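
To see why both runs above report `actual differs from expected`, the join and the arithmetic in `my_model` can be traced by hand. The sketch below re-implements only the `c` column of `my_model` in plain Python over the mocked rows from `unit_tests.yml`. It illustrates the arithmetic and is not how dbt or Materialize evaluate unit tests; the `my_model` function and the variable names are chosen here for readability.

```python
# Mocked inputs, copied from the `given` block of the unit test.
rows_a = [{"id": 1, "a": 1}]
rows_b = [{"id": 1, "b": 2}, {"id": 2, "b": 2}]

# Expected output, copied from the `expect` block.
expected = [{"c": 2}]


def my_model(rows_a, rows_b):
    """Emulate `SELECT a+b AS c ... JOIN ... ON a.id = b.id` (column c only)."""
    return [
        {"c": ra["a"] + rb["b"]}
        for ra in rows_a
        for rb in rows_b
        if ra["id"] == rb["id"]  # inner join on id
    ]


actual = my_model(rows_a, rows_b)
print("actual:  ", actual)
print("expected:", expected)
print("match:   ", actual == expected)
```

Only the `id = 1` rows join, so `c` is `1 + 2 = 3`, which corresponds to the `+++,3` line in the test output. Changing the expectation to `{c: 3}` should make the unit test pass.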

## Deployment

Once your dbt project is ready to move out of development, or as soon as you
13 changes: 12 additions & 1 deletion misc/dbt-materialize/CHANGELOG.md
@@ -1,13 +1,24 @@
# dbt-materialize Changelog

## Unreleased

* Update base adapter references as part of
[decoupling migration from dbt-core](https://github.com/dbt-labs/dbt-adapters/discussions/87)
* Migrate to dbt-common and dbt-adapters packages.
* Add tests for `--empty` flag as part of [dbt-labs/dbt-core#8971](https://github.com/dbt-labs/dbt-core/pull/8971)
* Add functional tests for unit testing.
* Support enforcing model contracts for the [`map`](https://materialize.com/docs/sql/types/map/),
[`list`](https://materialize.com/docs/sql/types/list/),
and [`record`](https://materialize.com/docs/sql/types/record/) pseudo-types.

## 1.7.8 - 2024-05-06

* Fix permission management in blue/green automation macros for non-admin users
([#26733](https://github.com/MaterializeInc/materialize/pull/26773)).

## 1.7.7 - 2024-04-19

* Tweak [`deploy_permission_validation]`](https://github.com/MaterializeInc/materialize/blob/main/misc/dbt-materialize/dbt/include/materialize/macros/deploy/deploy_permission_validation.sql)
* Tweak [`deploy_permission_validation`](https://github.com/MaterializeInc/materialize/blob/main/misc/dbt-materialize/dbt/include/materialize/macros/deploy/deploy_permission_validation.sql)
macro to work around [#26738](https://github.com/MaterializeInc/materialize/issues/26738).

## 1.7.6 - 2024-04-18
1 change: 1 addition & 0 deletions misc/dbt-materialize/Dockerfile
@@ -11,4 +11,5 @@ FROM python:3.8.6

COPY . dbt-materialize/

RUN pip install pytest
RUN pip install ./dbt-materialize[dev]
Original file line number Diff line number Diff line change
@@ -15,4 +15,4 @@
# limitations under the License.

# If you bump this version, bump it in setup.py too.
version = "1.7.8"
version = "1.8.0"
9 changes: 4 additions & 5 deletions misc/dbt-materialize/dbt/adapters/materialize/connections.py
@@ -17,13 +17,12 @@
from dataclasses import dataclass
from typing import Optional

import dbt_common.exceptions
import psycopg2
from dbt_common.semver import versions_compatible

import dbt.adapters.postgres.connections
import dbt.exceptions
from dbt.adapters.events.logging import AdapterLogger
from dbt.adapters.postgres import PostgresConnectionManager, PostgresCredentials
from dbt.events import AdapterLogger
from dbt.semver import versions_compatible

# If you bump this version, bump it in README.md too.
SUPPORTED_MATERIALIZE_VERSIONS = ">=0.68.0"
@@ -97,7 +96,7 @@ def open(cls, connection):
mz_version = mz_version.split()[0] # e.g. v0.79.0-dev
mz_version = mz_version[1:] # e.g. 0.79.0-dev
if not versions_compatible(mz_version, SUPPORTED_MATERIALIZE_VERSIONS):
raise dbt.exceptions.DbtRuntimeError(
raise dbt_common.exceptions.DbtRuntimeError(
f"Detected unsupported Materialize version {mz_version}\n"
f" Supported versions: {SUPPORTED_MATERIALIZE_VERSIONS}"
)
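
The version gate in this hunk first normalizes the string reported by Materialize and then checks it against `SUPPORTED_MATERIALIZE_VERSIONS`. The following is a minimal, standalone sketch of that flow. It assumes the raw string looks like `v0.79.0-dev (abc123)` (the part after the first space is a guess) and replaces `versions_compatible` with a simple major/minor/patch tuple comparison, so it is not the adapter's implementation and ignores pre-release ordering. The names `normalize_mz_version`, `core_version`, and `is_supported` are hypothetical.

```python
import re

# Lower bound taken from SUPPORTED_MATERIALIZE_VERSIONS = ">=0.68.0".
MINIMUM_VERSION = (0, 68, 0)


def normalize_mz_version(raw: str) -> str:
    """Mirror the two steps in `open`: keep the first token, drop the leading 'v'."""
    token = raw.split()[0]  # e.g. v0.79.0-dev
    return token[1:]        # e.g. 0.79.0-dev


def core_version(version: str) -> tuple:
    """Extract (major, minor, patch), ignoring any pre-release suffix."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_supported(raw: str, minimum: tuple = MINIMUM_VERSION) -> bool:
    """Simplified stand-in for the `versions_compatible` check."""
    return core_version(normalize_mz_version(raw)) >= minimum


print(normalize_mz_version("v0.79.0-dev (abc123)"))
print(is_supported("v0.79.0-dev (abc123)"))
print(is_supported("v0.67.9 (abc123)"))
```

Comparing integer tuples avoids the pitfall of comparing version strings lexically, where `0.9.0` would sort after `0.68.0`.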
Original file line number Diff line number Diff line change
@@ -15,7 +15,7 @@

from typing import Any

from dbt.exceptions import CompilationError
from dbt_common.exceptions import CompilationError


class RefreshIntervalConfigNotDictError(CompilationError):
23 changes: 13 additions & 10 deletions misc/dbt-materialize/dbt/adapters/materialize/impl.py
@@ -17,7 +17,13 @@
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

import dbt.exceptions
import dbt_common.exceptions
from dbt_common.contracts.constraints import (
ColumnLevelConstraint,
ConstraintType,
)
from dbt_common.dataclass_schema import ValidationError, dbtClassMixin

from dbt.adapters.base.impl import AdapterConfig, ConstraintSupport
from dbt.adapters.base.meta import available
from dbt.adapters.capability import (
@@ -32,14 +38,9 @@
RefreshIntervalConfigNotDictError,
)
from dbt.adapters.materialize.relation import MaterializeRelation
from dbt.adapters.postgres import PostgresAdapter
from dbt.adapters.postgres.column import PostgresColumn
from dbt.adapters.postgres.impl import PostgresAdapter
from dbt.adapters.sql.impl import LIST_RELATIONS_MACRO_NAME
from dbt.contracts.graph.nodes import (
ColumnLevelConstraint,
ConstraintType,
)
from dbt.dataclass_schema import ValidationError, dbtClassMixin


# types in ./misc/dbt-materialize need to import generic types from typing
@@ -58,10 +59,12 @@ def parse(cls, raw_index) -> Optional["MaterializeIndexConfig"]:
cls.validate(raw_index)
return cls.from_dict(raw_index)
except ValidationError as exc:
msg = dbt.exceptions.validator_error_message(exc)
dbt.exceptions.CompilationError(f"Could not parse index config: {msg}")
msg = dbt_common.exceptions.validator_error_message(exc)
dbt_common.exceptions.CompilationError(
f"Could not parse index config: {msg}"
)
except TypeError:
dbt.exceptions.CompilationError(
dbt_common.exceptions.CompilationError(
"Invalid index config:\n"
f" Got: {raw_index}\n"
' Expected a dictionary with at minimum a "columns" key'
7 changes: 4 additions & 3 deletions misc/dbt-materialize/dbt/adapters/materialize/relation.py
@@ -17,9 +17,10 @@
from dataclasses import dataclass
from typing import Optional, Type

from dbt.adapters.postgres import PostgresRelation
from dbt.dataclass_schema import StrEnum
from dbt.utils import classproperty
from dbt_common.dataclass_schema import StrEnum

from dbt.adapters.postgres.relation import PostgresRelation
from dbt.adapters.utils import classproperty


# types in ./misc/dbt-materialize need to import generic types from typing
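
The import changes across `connections.py`, `impl.py`, `relation.py`, and the exceptions module follow one pattern: names that lived in `dbt-core` modules now come from `dbt_common` or `dbt.adapters`. As a rough summary, the sketch below collects the moves visible in the hunks above into a lookup table and applies them to single-name `from ... import ...` lines. The table is inferred from this diff only and may not be complete for dbt v1.8.0; `MOVED` and `rewrite_import` are hypothetical names, not part of dbt.

```python
import re

# (old module, imported name) -> new module, as inferred from this commit.
MOVED = {
    ("dbt.exceptions", "DbtRuntimeError"): "dbt_common.exceptions",
    ("dbt.exceptions", "CompilationError"): "dbt_common.exceptions",
    ("dbt.exceptions", "validator_error_message"): "dbt_common.exceptions",
    ("dbt.semver", "versions_compatible"): "dbt_common.semver",
    ("dbt.dataclass_schema", "ValidationError"): "dbt_common.dataclass_schema",
    ("dbt.dataclass_schema", "dbtClassMixin"): "dbt_common.dataclass_schema",
    ("dbt.dataclass_schema", "StrEnum"): "dbt_common.dataclass_schema",
    ("dbt.contracts.graph.nodes", "ColumnLevelConstraint"): "dbt_common.contracts.constraints",
    ("dbt.contracts.graph.nodes", "ConstraintType"): "dbt_common.contracts.constraints",
    ("dbt.events", "AdapterLogger"): "dbt.adapters.events.logging",
    ("dbt.utils", "classproperty"): "dbt.adapters.utils",
    ("dbt.adapters.postgres", "PostgresAdapter"): "dbt.adapters.postgres.impl",
    ("dbt.adapters.postgres", "PostgresRelation"): "dbt.adapters.postgres.relation",
}

# Matches only the simple form `from <module> import <name>`.
_FROM_IMPORT = re.compile(r"from\s+(\S+)\s+import\s+(\w+)")


def rewrite_import(line: str) -> str:
    """Return the migrated import line, or the input unchanged if no rule applies."""
    match = _FROM_IMPORT.fullmatch(line.strip())
    if match is None:
        return line
    module, name = match.groups()
    new_module = MOVED.get((module, name))
    if new_module is None:
        return line
    return f"from {new_module} import {name}"


print(rewrite_import("from dbt.exceptions import CompilationError"))
print(rewrite_import("from dbt.utils import classproperty"))
```

Keying the table on the `(module, name)` pair rather than on the module alone matters because not everything in a given old module moved to the same place: in this commit, `connections.py` still imports `PostgresConnectionManager` and `PostgresCredentials` from `dbt.adapters.postgres`.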