Merge branch 'current' into mwan-control-plane-blog
mirnawong1 authored Jan 21, 2025
2 parents 5082e6c + 0458e50 · commit 89592f6
Showing 52 changed files with 1,011 additions and 858 deletions.
5 changes: 5 additions & 0 deletions website/docs/docs/build/custom-aliases.md
@@ -157,3 +157,8 @@ If these models should indeed have the same database identifier, you can work ar

By default, dbt will create versioned models with the alias `<model_name>_v<v>`, where `<v>` is that version's unique identifier. You can customize this behavior just like for non-versioned models by configuring a custom `alias` or re-implementing the `generate_alias_name` macro.
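As an illustration, here's a minimal sketch of a `generate_alias_name` override, placed in your project's `macros/` directory. The signature mirrors dbt's built-in macro, but the policy shown (the latest version keeps the plain model name) is just one possible choice, not required behavior:

```sql
-- macros/generate_alias_name.sql
-- Sketch: keep the default <model_name>_v<v> alias for older versions,
-- but let the latest version of a model claim the plain model name.
{% macro generate_alias_name(custom_alias_name=none, node=none) -%}
    {%- if custom_alias_name -%}
        {{ custom_alias_name | trim }}
    {%- elif node.version and node.version != node.latest_version -%}
        {{ node.name ~ '_v' ~ (node.version | replace('.', '_')) }}
    {%- else -%}
        {{ node.name }}
    {%- endif -%}
{%- endmacro %}
```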

## Related docs

- [Customize dbt models database, schema, and alias](/guides/customize-schema-alias?step=1) for a step-by-step guide to customizing all three
- [Custom schema](/docs/build/custom-schemas) to learn how to customize dbt model schemas
- [Custom database](/docs/build/custom-databases) to learn how to customize dbt model databases
6 changes: 6 additions & 0 deletions website/docs/docs/build/custom-databases.md
@@ -98,3 +98,9 @@ See docs on macro `dispatch`: ["Managing different global overrides across packa
### BigQuery

When dbt opens a BigQuery connection, it will do so using the `project_id` defined in your active `profiles.yml` target. This `project_id` will be billed for the queries that are executed in the dbt run, even if some models are configured to be built in other projects.
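For example, a sketch of this setup; the project names and profile details are hypothetical:

```yaml
# profiles.yml (sketch) -- the project set here is the one BigQuery bills,
# regardless of where individual models are built.
my_profile:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: oauth
      project: billing-project   # hypothetical billed project
      dataset: dbt_dev
      threads: 4

# dbt_project.yml (sketch) -- models under marts/ build in reporting-project,
# but their query costs still accrue to billing-project above:
#
# models:
#   my_project:
#     marts:
#       +database: reporting-project   # on BigQuery, database maps to project
```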

## Related docs

- [Customize dbt models database, schema, and alias](/guides/customize-schema-alias?step=1) for a step-by-step guide to customizing all three
- [Custom schema](/docs/build/custom-schemas) to learn how to customize dbt model schemas
- [Custom aliases](/docs/build/custom-aliases) to learn how to customize dbt model alias names
6 changes: 6 additions & 0 deletions website/docs/docs/build/custom-schemas.md
@@ -207,3 +207,9 @@ In the `generate_schema_name` macro examples shown in the [built-in alternative
If your schema names are being generated incorrectly, double-check your target name in the relevant environment.

For more information, consult the [managing environments in dbt Core](/docs/core/dbt-core-environments) guide.
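For example, a minimal sketch of a target-aware `generate_schema_name` override; the target name `prod` is an assumption, so match it to your own environment:

```sql
-- macros/generate_schema_name.sql
-- Sketch: use the custom schema on its own in the "prod" target, and the
-- default <target_schema>_<custom_schema> concatenation everywhere else.
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- set default_schema = target.schema -%}
    {%- if custom_schema_name is none -%}
        {{ default_schema }}
    {%- elif target.name == 'prod' -%}
        {{ custom_schema_name | trim }}
    {%- else -%}
        {{ default_schema }}_{{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```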

## Related docs

- [Customize dbt models database, schema, and alias](/guides/customize-schema-alias?step=1) for a step-by-step guide to customizing all three
- [Custom database](/docs/build/custom-databases) to learn how to customize dbt model databases
- [Custom aliases](/docs/build/custom-aliases) to learn how to customize dbt model alias names
4 changes: 3 additions & 1 deletion website/docs/docs/build/data-tests.md
@@ -73,7 +73,9 @@ having total_amount < 0

The name of this test is the name of the file: `assert_total_payment_amount_is_positive`.

Note, you won't need to include semicolons (;) at the end of the SQL statement in your singular test files as it can cause your test to fail.
Note:
- Omit semicolons (;) at the end of the SQL statement in your singular test files, as they can cause your test to fail.
- Singular tests placed in the `tests` directory are automatically executed when running `dbt test`. Don't reference singular tests in `model_name.yml`, as they are not treated as generic tests or macros, and doing so will result in an error.

To add a description to a singular test in your project, add a `.yml` file to your `tests` directory, for example, `tests/schema.yml` with the following content:

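A minimal sketch (the description text is illustrative; on dbt versions earlier than 1.8, use the `tests:` key instead of `data_tests:`):

```yaml
version: 2
data_tests:
  - name: assert_total_payment_amount_is_positive
    description: >
      Refunds can reduce a payment, but the total payment amount
      for an order should never drop below zero.
```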
24 changes: 10 additions & 14 deletions website/docs/docs/build/enhance-your-code.md
@@ -7,21 +7,17 @@ pagination_prev: null

<div className="grid--2-col">

<Card
title="Project variables"
body="Learn how to use project variables to provide data to models for compilation."
link="/docs/build/project-variables"
icon="dbt-bit"/>

<Card
title="Environment variables"
body="Learn how you can use environment variables to customize the behavior of a dbt project."
link="/docs/build/environment-variables"
icon="dbt-bit"/>

</div>
<br />
<div className="grid--2-col">
<Card
title="Hooks and operations"
body="Learn how to use hooks to trigger actions and operations to invoke macros."
link="/docs/build/hooks-operations"
icon="dbt-bit"/>

<Card
title="Packages"
@@ -30,9 +26,9 @@ pagination_prev: null
icon="dbt-bit"/>

<Card
title="Hooks and operations"
body="Learn how to use hooks to trigger actions and operations to invoke macros."
link="/docs/build/hooks-operations"
icon="dbt-bit"/>
title="Project variables"
body="Learn how to use project variables to provide data to models for compilation."
link="/docs/build/project-variables"
icon="dbt-bit"/>

</div>
</div>
17 changes: 15 additions & 2 deletions website/docs/docs/build/incremental-microbatch.md
@@ -29,14 +29,27 @@ Microbatch is an incremental strategy designed for large time-series datasets:

- Note, microbatch might not be the best strategy for all use cases. Consider other strategies when you don't have a reliable `event_time` column or when you want more control over the incremental logic. Read more in [How `microbatch` compares to other incremental strategies](#how-microbatch-compares-to-other-incremental-strategies).

### How microbatch works
## How microbatch works

When dbt runs a microbatch model — whether for the first time, during incremental runs, or in specified backfills — it will split the processing into multiple queries (or "batches"), based on the `event_time` and `batch_size` you configure.

Each "batch" corresponds to a single bounded time period (by default, a single day of data). Where other incremental strategies operate only on "old" and "new" data, microbatch models treat every batch as an atomic unit that can be built or replaced on its own. Each batch is independent and <Term id="idempotent" />.

This is a powerful abstraction that makes it possible for dbt to run batches [separately](#backfills), concurrently, and [retry](#retry) them independently.
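Concretely, here's a minimal sketch of a microbatch model configuration; the model name, upstream ref, and column name are illustrative:

```sql
-- models/sessions.sql (sketch)
{{ config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='session_start',  -- column used to bound each batch
    batch_size='day',            -- one batch per day of data
    begin='2024-01-01',          -- earliest date for an initial full build
    lookback=3                   -- also reprocess the most recent 3 batches
) }}

select * from {{ ref('stg_page_views') }}
```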

### Adapter-specific behavior

dbt's microbatch strategy uses the most efficient mechanism available for "full batch" replacement, which varies by adapter:

- `dbt-postgres`: Uses the `merge` strategy, which performs "update" or "insert" operations.
- `dbt-redshift`: Uses the `delete+insert` strategy, which "inserts" or "replaces."
- `dbt-snowflake`: Uses the `delete+insert` strategy, which "inserts" or "replaces."
- `dbt-bigquery`: Uses the `insert_overwrite` strategy, which "inserts" or "replaces."
- `dbt-spark`: Uses the `insert_overwrite` strategy, which "inserts" or "replaces."
- `dbt-databricks`: Uses the `replace_where` strategy, which "inserts" or "replaces."

Check out the [supported incremental strategies by adapter](/docs/build/incremental-strategy#supported-incremental-strategies-by-adapter) for more info.

## Example

A `sessions` model aggregates and enriches data that comes from two other models:
@@ -170,7 +183,7 @@ customers as (

</Tabs>

dbt will instruct the data platform to take the result of each batch query and insert, update, or replace the contents of the `analytics.sessions` table for the same day of data. To perform this operation, dbt will use the most efficient atomic mechanism for "full batch" replacement that is available on each data platform.
dbt will instruct the data platform to take the result of each batch query and [insert, update, or replace](#adapter-specific-behavior) the contents of the `analytics.sessions` table for the same day of data. To perform this operation, dbt will use the most efficient atomic mechanism for "full batch" replacement that is available on each data platform. For details, see [How microbatch works](#how-microbatch-works).

It does not matter whether the table already contains data for that day. Given the same input data, the resulting table is the same no matter how many times a batch is reprocessed.
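For example, a sketch of a targeted backfill; the flags follow the microbatch docs, and the model name and date range are illustrative:

```bash
# Rebuild only the batches that fall between these timestamps; each batch
# is replaced atomically and independently of the rest of the table.
dbt run --select sessions --event-time-start "2024-09-01" --event-time-end "2024-09-03"
```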

1 change: 1 addition & 0 deletions website/docs/docs/build/incremental-strategy.md
@@ -1,5 +1,6 @@
---
title: "About incremental strategy"
sidebar_label: "About incremental strategy"
description: "Learn about the various ways (strategies) to implement incremental materializations."
id: "incremental-strategy"
---
81 changes: 40 additions & 41 deletions website/docs/docs/build/metricflow-commands.md
@@ -77,7 +77,7 @@ The following table lists the commands compatible with the dbt Cloud IDE and dbt
| [`list dimension-values`](#list-dimension-values) | List dimensions with metrics. |||
| [`list entities`](#list-entities) | Lists all unique entities. |||
| [`list saved-queries`](#list-saved-queries) | Lists available saved queries. Use the `--show-exports` flag to display each export listed under a saved query or `--show-parameters` to show the full query parameters each saved query uses. |||
| [`query`](#query) | Query metrics, saved queries, and dimensions you want to see in the command line interface. Refer to [query examples](#query-examples) to help you get started. |||
| [`query`](#query) | Query metrics, saved queries, and dimensions you want to see in the command line interface. Refer to [query examples](#query-examples) for common patterns, such as querying metrics, using the `where` filter, and adding an `order`. |||
| [`validate`](#validate) | Validates semantic model configurations. |||
| [`export`](#export) | Runs exports for a singular saved query for testing and generating exports in your development environment. You can also use the `--select` flag to specify particular exports from a saved query. |||
| [`export-all`](#export-all) | Runs exports for multiple saved queries at once, saving time and effort. |||
@@ -118,7 +118,7 @@ Use the `mf` prefix before the command name to execute them in dbt Core. For exa
</TabItem>
</Tabs>

### List metrics
## List metrics
This command lists the metrics with their available dimensions:

@@ -132,7 +132,7 @@

```bash
Options:
--help Show this message and exit.
```

### List dimensions
## List dimensions

This command lists all unique dimensions for a metric or multiple metrics. It displays only common dimensions when querying multiple metrics:

@@ -146,7 +146,7 @@

```bash
Options:
--help Show this message and exit.
```

### List dimension-values
## List dimension-values

This command lists all dimension values with the corresponding metric:

@@ -168,7 +168,7 @@

```bash
Options:
--help Show this message and exit.
```

### List entities
## List entities

This command lists all unique entities:

@@ -182,7 +182,7 @@

```bash
Options:
--help Show this message and exit.
```

### List saved queries
## List saved queries

This command lists all available saved queries:

@@ -209,7 +209,7 @@

```bash
The list of available saved queries:
- Export(new_customer_orders, alias=orders, schemas=customer_schema, exportAs=TABLE)
```

### Validate
## Validate

The following command performs validations against the defined semantic model configurations.

@@ -234,7 +234,7 @@

```bash
Options:
--help Show this message and exit.
```
### Health checks
## Health checks
The following command performs a health check against the data platform you provided in the configs.
@@ -244,7 +244,7 @@ Note, in dbt Cloud the `health-checks` command isn't required since it uses dbt

```bash
mf health-checks # In dbt Core
```
### Tutorial
## Tutorial
Follow the dedicated MetricFlow tutorial to help you get started:
<!--dbt sl tutorial # In dbt Cloud-->
@@ -253,7 +253,7 @@

```bash
mf tutorial # In dbt Core
```
### Query
## Query
Create a new query with MetricFlow and execute it against your data platform. The query returns the following result:
@@ -284,10 +284,11 @@ Options:
time of the data (inclusive)
*Not available in dbt Cloud yet

--where TEXT SQL-like where statement provided as a string and wrapped in quotes: --where "condition_statement"
For example, to query a single statement: --where "revenue > 100"
To query multiple statements: --where "revenue > 100 and user_count < 1000"
To add a dimension filter to a where filter, ensure the filter item is part of your model.
--where TEXT SQL-like where statement provided as a string and wrapped in quotes.
All filter items must explicitly reference fields or dimensions that are part of your model.
To query a single statement: --where "{{ Dimension('order_id__revenue') }} > 100"
To query multiple statements: --where "{{ Dimension('order_id__revenue') }} > 100 and {{ Dimension('user_count') }} < 1000"
To add a dimension filter, use the `Dimension()` template wrapper to indicate that the filter item is part of your model.
Refer to the [FAQ](#faqs) for more info on how to do this using a template wrapper.
--limit TEXT Limit the number of rows out using an int or leave
@@ -318,13 +319,18 @@ Options:
```
### Query examples
## Query examples
The following tabs present various types of query examples that you can use to query metrics and dimensions. Select the tab that best suits your needs:
This section provides various query examples that you can use to query metrics and dimensions. The query examples listed are:
<Tabs>
- [Query metrics](#query-metrics)
- [Query dimensions](#query-dimensions)
- [Add `order`/`limit` function](#add-orderlimit)
- [Add `where` clause](#add-where-clause)
- [Filter by time](#filter-by-time)
- [Query saved queries](#query-saved-queries)
<TabItem value="eg1" label="Metrics">
### Query metrics
Use the following example to query multiple metrics by dimension and return the `order_total` and `users_active` metrics by `metric_time`.
@@ -347,9 +353,8 @@ mf query --metrics order_total,users_active --group-by metric_time # In dbt Core
| 2017-06-20 | 712.51 |
| 2017-06-21 | 541.65 |
```
</TabItem>
<TabItem value="eg2" label="Dimensions">
### Query dimensions
You can include multiple dimensions in a query. For example, you can group by the `is_food_order` dimension to confirm if orders were for food or not. Note that when you query a dimension, you need to specify the primary entity for that dimension. In the following example, the primary entity is `order_id`.
@@ -375,9 +380,7 @@ mf query --metrics order_total --group-by order_id__is_food_order # In dbt Core
| 2017-06-19 | True | 448.11 |
```
</TabItem>
<TabItem value="eg3" label="Order/limit">
### Add order/limit
You can add order and limit functions to filter and present the data in a readable format. The following query limits the data set to 10 records and orders them by `metric_time`, descending. Note that the `-` prefix sorts the query in descending order; without it, the query sorts in ascending order.
@@ -405,21 +408,24 @@ mf query --metrics order_total --group-by order_id__is_food_order --limit 10 --o
| 2017-08-29 | False | 333.65 |
| 2017-08-28 | False | 334.73 |
```
</TabItem>
<TabItem value="eg4" label="where clause">
### Add where clause
You can further filter the data set by adding a `where` clause to your query. The following example shows you how to query the `order_total` metric, grouped by `is_food_order` with multiple where statements (orders that are food orders and orders from the week starting on or after Feb 1st, 2024). Note that when you query a dimension, you need to specify the primary entity for that dimension. In the following example, the primary entity is `order_id`.
You can further filter the data set by adding a `where` clause to your query. The following example shows you how to query the `order_total` metric, grouped by `is_food_order` with multiple `where` statements (orders that are food orders and orders from the week starting on or after Feb 1st, 2024).
**Query**
```bash
# In dbt Cloud
dbt sl query --metrics order_total --group-by order_id__is_food_order --where "{{ Dimension('order_id__is_food_order') }} = True and metric_time__week >= '2024-02-01'"
dbt sl query --metrics order_total --group-by order_id__is_food_order --where "{{ Dimension('order_id__is_food_order') }} = True and {{ TimeDimension('metric_time', 'week') }} >= '2024-02-01'"
# In dbt Core
mf query --metrics order_total --group-by order_id__is_food_order --where "{{ Dimension('order_id__is_food_order') }} = True and metric_time__week >= '2024-02-01'"
mf query --metrics order_total --group-by order_id__is_food_order --where "{{ Dimension('order_id__is_food_order') }} = True and {{ TimeDimension('metric_time', 'week') }} >= '2024-02-01'"
```
Notes:
- The dimension type determines the syntax you use: for date or time fields, use `TimeDimension` instead of `Dimension`.
- When you query a dimension, you need to specify its primary entity. In the preceding example, the primary entity is `order_id`.
**Result**
```bash
✔ Success 🦄 - query completed after 1.06 seconds
@@ -437,9 +443,7 @@ mf query --metrics order_total --group-by order_id__is_food_order --where "{{ Di
| 2017-08-22 | True | 401.91 |
```
</TabItem>
<TabItem value="eg5" label=" Filter by time">
### Filter by time
To filter by time, there are dedicated start and end time options. Using these options to filter by time allows MetricFlow to further optimize query performance by pushing down the where filter when appropriate.
@@ -468,9 +472,7 @@ mf query --metrics order_total --group-by order_id__is_food_order --limit 10 --o
| 2017-08-22 | True | 401.91 |
```
</TabItem>
<TabItem value="eg6" label=" Saved queries">
### Query saved queries
You can use this for frequently used queries. Replace `<name>` with the name of your [saved query](/docs/build/saved-queries).
@@ -487,10 +489,7 @@ For example, if you use dbt Cloud and have a saved query named `new_customer_ord
When querying [saved queries](/docs/build/saved-queries), you can use parameters such as `where`, `limit`, `order`, `compile`, and so on. However, keep in mind that you can't access `metric` or `group_by` parameters in this context. This is because they are predetermined and fixed parameters for saved queries, and you can't change them at query time. If you would like to query more metrics or dimensions, you can build the query using the standard format.
:::
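For instance, a sketch that combines a saved query with a query-time filter; the saved-query name reuses the earlier example, and the filter itself is illustrative:

```bash
dbt sl query --saved-query new_customer_orders --where "{{ TimeDimension('metric_time', 'week') }} >= '2024-02-01'"
```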
</TabItem>
</Tabs>
### Additional query examples
## Additional query examples
The following tabs present additional query examples, like exporting to a CSV. Select the tab that best suits your needs:
@@ -559,7 +558,7 @@ mf query --metrics order_total --group-by metric_time,is_food_order --limit 10 -
</TabItem>
</Tabs>
### Time granularity
## Time granularity
Optionally, you can specify the time granularity at which you want your data aggregated by appending two underscores and the unit of granularity to `metric_time`, the global time dimension. Supported granularities are `day`, `week`, `month`, `quarter`, and `year`.
@@ -571,7 +570,7 @@

```bash
dbt sl query --metrics revenue --group-by metric_time__month # In dbt Cloud
mf query --metrics revenue --group-by metric_time__month # In dbt Core
```
### Export
## Export
Run [exports for a specific saved query](/docs/use-dbt-semantic-layer/exports#exports-for-single-saved-query). Use this command to test and generate exports in your development environment. You can also use the `--select` flag to specify particular exports from a saved query. Refer to [exports in development](/docs/use-dbt-semantic-layer/exports#exports-in-development) for more info.
@@ -581,7 +580,7 @@

Export is available in dbt Cloud.

```bash
dbt sl export
```
### Export-all
## Export-all
Run [exports for multiple saved queries](/docs/use-dbt-semantic-layer/exports#exports-for-multiple-saved-queries) at once. This command provides a convenient way to manage and execute exports for several queries simultaneously, saving time and effort. Refer to [exports in development](/docs/use-dbt-semantic-layer/exports#exports-in-development) for more info.
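A minimal sketch, assuming the same dbt Cloud setup as the `export` command above:

```bash
dbt sl export-all
```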