Guide for 'Connecting to databases' (#24062)
## Summary & Motivation

New docs guide for "Connecting to Databases"

## Changelog [New | Bug | Docs]

NOCHANGELOG

---------

Co-authored-by: colton <[email protected]>
Showing 4 changed files with 212 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,61 @@ | ||
--- | ||
title: Connecting to databases | ||
description: How to configure resources to connect to databases | ||
sidebar_position: 10 | ||
--- | ||
--- | ||
|
||
In Dagster, *resources* connect to databases by wrapping database clients. You register a resource, along with its connection details, in the `Definitions` object and then reference it from your asset definitions. | ||
|
||
## What you'll learn | ||
|
||
- How to connect to and query a local DuckDB database using the `DuckDBResource` | ||
- How to connect to different databases in different environments, such as development and production | ||
- How to connect to a Snowflake database using the `SnowflakeResource` | ||
|
||
<details> | ||
<summary>Prerequisites</summary> | ||
|
||
To follow the steps in this guide, you'll need: | ||
|
||
- Familiarity with [Asset definitions](/concepts/assets) | ||
|
||
If you want to run the examples in this guide, you'll need: | ||
- Connection information for a Snowflake database | ||
- To `pip install` the `dagster-duckdb`, `dagster-snowflake`, and `pandas` packages | ||
|
||
</details> | ||
|
||
## Define a DuckDB resource and use it in an asset definition | ||
|
||
Here is an example of a DuckDB resource definition that's used to create two tables in the DuckDB database. | ||
|
||
<CodeExample filePath="guides/external-systems/resource-duckdb-example.py" language="python" title="DuckDB Resource Example" /> | ||
|
||
## Define a resource that depends on an environment variable | ||
|
||
Resources can be configured using environment variables to connect to environment-specific databases. For example, a resource can connect to a test database in a development environment and a live database in the production environment. You can change the resource definition in the previous example to use an `EnvVar` as shown here: | ||
|
||
<CodeExample filePath="guides/external-systems/resource-duckdb-envvar-example.py" language="python" title="DuckDB Resource using EnvVar Example" /> | ||
|
||
When launching a run, the database path will be read from the `IRIS_DUCKDB_PATH` environment variable. | ||
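The deferred lookup described above can be illustrated without Dagster. The following is a minimal sketch, not Dagster's implementation: `DeferredEnvPath` is a hypothetical stand-in for `EnvVar`, and both file paths are made-up values. It shows only the idea that the variable name is stored when the resource is defined and the value is read later, so the same definition can point at different databases in different environments.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DeferredEnvPath:
    """Hypothetical stand-in for EnvVar: stores only the variable name."""

    name: str

    def resolve(self) -> str:
        # Read the variable when the value is needed, not when the object is created.
        value = os.environ.get(self.name)
        if value is None:
            raise RuntimeError(f"Environment variable {self.name} is not set")
        return value


# Created at definition time; nothing is read from the environment yet.
iris_db_path = DeferredEnvPath("IRIS_DUCKDB_PATH")

# Simulated development environment (the path is an assumed example value).
os.environ["IRIS_DUCKDB_PATH"] = "/tmp/iris_dev.duckdb"
dev_path = iris_db_path.resolve()

# Simulated production environment: same definition, different value.
os.environ["IRIS_DUCKDB_PATH"] = "/data/iris_prod.duckdb"
prod_path = iris_db_path.resolve()

print(dev_path)
print(prod_path)
```

In a real deployment, you would set `IRIS_DUCKDB_PATH` outside the code, for example in the shell or the deployment configuration, rather than assigning to `os.environ`.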
|
||
## Define a Snowflake resource and use it in an asset definition | ||
|
||
Using the Snowflake resource is similar to using the DuckDB resource. Here is a complete example showing how to connect to a Snowflake database and create two tables: | ||
|
||
<CodeExample filePath="guides/external-systems/resource-snowflake-example.py" language="python" title="Snowflake Resource Example" /> | ||
|
||
**Note:** Before running this example, you will need to set the `SNOWFLAKE_PASSWORD` environment variable. | ||
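One way to catch a missing variable before launching a run is a small pre-flight check. This is a sketch using only the standard library: the helper `missing_env_vars` and the `REQUIRED_ENV_VARS` tuple are illustrative names, not part of Dagster or `dagster-snowflake`.

```python
import os

# Variables the Snowflake example reads from the environment.
# Extend this tuple if you move more settings into environment variables.
REQUIRED_ENV_VARS = ("SNOWFLAKE_PASSWORD",)


def missing_env_vars(names=REQUIRED_ENV_VARS, environ=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these environment variables first: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

The check treats an empty value the same as an unset one, since an empty password would fail at connection time anyway.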
|
||
## Other database resource types | ||
|
||
See [Dagster Integrations](https://dagster.io/integrations) for resource types that connect to other databases. Some other popular resource types are: | ||
|
||
* [`BigQueryResource`](https://dagster.io/integrations/dagster-gcp-bigquery) | ||
* [`RedshiftClientResource`](https://dagster.io/integrations/dagster-aws-redshift) | ||
|
||
## Next steps | ||
|
||
- Explore how to use resources for [Connecting to APIs](/guides/external-systems/apis) | ||
- Go deeper into [Understanding Resources](/concepts/resources) | ||
|
47 changes: 47 additions & 0 deletions
...eta_snippets/docs_beta_snippets/guides/external-systems/resource-duckdb-envvar-example.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
import pandas as pd | ||
from dagster_duckdb import DuckDBResource | ||
|
||
import dagster as dg | ||
|
||
|
||
# An asset that uses a DuckDB resource called iris_db | ||
@dg.asset | ||
def iris_dataset(iris_db: DuckDBResource) -> None: | ||
    iris_df = pd.read_csv( | ||
        "https://docs.dagster.io/assets/iris.csv", | ||
        names=[ | ||
            "sepal_length_cm", | ||
            "sepal_width_cm", | ||
            "petal_length_cm", | ||
            "petal_width_cm", | ||
            "species", | ||
        ], | ||
    ) | ||
|
||
    with iris_db.get_connection() as conn: | ||
        conn.execute("CREATE SCHEMA IF NOT EXISTS iris") | ||
        conn.execute("CREATE TABLE iris.iris_dataset AS SELECT * FROM iris_df") | ||
|
||
|
||
# Another asset that uses the iris_db resource | ||
@dg.asset(deps=[iris_dataset]) | ||
def iris_setosa(iris_db: DuckDBResource) -> None: | ||
    with iris_db.get_connection() as conn: | ||
        conn.execute( | ||
            "CREATE TABLE iris.iris_setosa AS SELECT * FROM iris.iris_dataset WHERE" | ||
            " species = 'Iris-setosa'" | ||
        ) | ||
|
||
|
||
defs = dg.Definitions( | ||
    assets=[iris_dataset, iris_setosa], | ||
    resources={ | ||
        # highlight-start | ||
        # This defines a DuckDB resource that reads the database path | ||
        # from the IRIS_DUCKDB_PATH environment variable | ||
        "iris_db": DuckDBResource( | ||
            database=dg.EnvVar("IRIS_DUCKDB_PATH"), | ||
        ) | ||
        # highlight-end | ||
    }, | ||
) |
51 changes: 51 additions & 0 deletions
.../docs_beta_snippets/docs_beta_snippets/guides/external-systems/resource-duckdb-example.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
import pandas as pd | ||
from dagster_duckdb import DuckDBResource | ||
|
||
import dagster as dg | ||
|
||
|
||
# highlight-start | ||
# An asset that uses a DuckDB resource called iris_db | ||
# Note the parameter name `iris_db` must match the resource defined later | ||
@dg.asset | ||
def iris_dataset(iris_db: DuckDBResource) -> None: | ||
    # highlight-end | ||
    iris_df = pd.read_csv( | ||
        "https://docs.dagster.io/assets/iris.csv", | ||
        names=[ | ||
            "sepal_length_cm", | ||
            "sepal_width_cm", | ||
            "petal_length_cm", | ||
            "petal_width_cm", | ||
            "species", | ||
        ], | ||
    ) | ||
|
||
    # highlight-start | ||
    with iris_db.get_connection() as conn: | ||
        conn.execute("CREATE SCHEMA IF NOT EXISTS iris") | ||
        conn.execute("CREATE TABLE iris.iris_dataset AS SELECT * FROM iris_df") | ||
    # highlight-end | ||
|
||
|
||
# Another asset that uses the iris_db resource | ||
@dg.asset(deps=[iris_dataset]) | ||
def iris_setosa(iris_db: DuckDBResource) -> None: | ||
    with iris_db.get_connection() as conn: | ||
        conn.execute( | ||
            "CREATE TABLE iris.iris_setosa AS SELECT * FROM iris.iris_dataset WHERE" | ||
            " species = 'Iris-setosa'" | ||
        ) | ||
|
||
|
||
defs = dg.Definitions( | ||
    assets=[iris_dataset, iris_setosa], | ||
    resources={ | ||
        # highlight-start | ||
        # This defines a DuckDB resource called iris_db | ||
        "iris_db": DuckDBResource( | ||
            database="/tmp/iris_dataset.duckdb", | ||
        ) | ||
        # highlight-end | ||
    }, | ||
) |
56 changes: 56 additions & 0 deletions
...cs_beta_snippets/docs_beta_snippets/guides/external-systems/resource-snowflake-example.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
import pandas as pd | ||
from dagster_snowflake import SnowflakeResource | ||
from snowflake.connector.pandas_tools import write_pandas | ||
|
||
import dagster as dg | ||
|
||
|
||
# An asset that uses a Snowflake resource called iris_db | ||
# and creates a new table from a Pandas dataframe | ||
@dg.asset | ||
def iris_dataset(iris_db: SnowflakeResource) -> None: | ||
    iris_df = pd.read_csv( | ||
        "https://docs.dagster.io/assets/iris.csv", | ||
        names=[ | ||
            "sepal_length_cm", | ||
            "sepal_width_cm", | ||
            "petal_length_cm", | ||
            "petal_width_cm", | ||
            "species", | ||
        ], | ||
    ) | ||
|
||
    with iris_db.get_connection() as conn: | ||
        write_pandas(conn, iris_df, table_name="iris_dataset") | ||
|
||
|
||
# An asset that uses a Snowflake resource called iris_db | ||
# and creates a new table from an existing table | ||
@dg.asset(deps=[iris_dataset]) | ||
def iris_setosa(iris_db: SnowflakeResource) -> None: | ||
    with iris_db.get_connection() as conn: | ||
        conn.cursor().execute(""" | ||
            CREATE OR REPLACE TABLE iris_setosa as ( | ||
                SELECT * | ||
                FROM iris.iris_dataset | ||
                WHERE species = 'Iris-setosa' | ||
            );""") | ||
|
||
|
||
defs = dg.Definitions( | ||
    assets=[iris_dataset, iris_setosa], | ||
    resources={ | ||
        # highlight-start | ||
        "iris_db": SnowflakeResource( | ||
            # Set the SNOWFLAKE_PASSWORD environment variable before running this code | ||
            password=dg.EnvVar("SNOWFLAKE_PASSWORD"), | ||
            # Update the following strings to match your Snowflake database | ||
            warehouse="snowflake_warehouse", | ||
            account="snowflake_account", | ||
            user="snowflake_user", | ||
            database="iris_database", | ||
            schema="iris_schema", | ||
        ) | ||
        # highlight-end | ||
    }, | ||
) |
f7a5d66
Deploy preview for dagster-docs-beta ready!
✅ Preview
https://dagster-docs-beta-95wismed7-elementl.vercel.app
https://dagster-docs-beta.dagster-docs.io
Built with commit f7a5d66.
This pull request is being automatically deployed with vercel-action