Enable dlt to run on managed AWS airflow MWAA #995

Open
adrianbr opened this issue Feb 23, 2024 · 3 comments
Labels
bug (Something isn't working), community (This issue came from slack community workspace)

Comments

@adrianbr
Contributor

adrianbr commented Feb 23, 2024

Feature description

User:
"I've been at a standstill, actually. It looks to me like the constraints from MWAA are going to prohibit me from using Snowflake as a destination. Writing to S3 is a viable replacement, but I'm still having trouble there as well. The most recent version of Airflow available on MWAA is 2.7.2.
At the moment, Airflow is showing some import conflicts. It can't see the path to locate PipelineTasksGroup in my DAG, nor can it find DltResource in __init__.py.
There is also a version conflict with s3fs, which relies on aiobotocore. The constraint on aiobotocore here is 2.6.0 which is not"

Are you a dlt user?

I'd consider using dlt, but it's lacking a feature I need.

Use case

Run Airflow on managed AWS (MWAA) with dlt. The user is hitting library conflicts; S3 seems like it could be an easier destination than Snowflake.

You can ask the user for more information here: https://dlthub-community.slack.com/archives/C04DQA7JJN6/p1707496343595039

@rudolfix added the bug label on Feb 24, 2024
@rudolfix
Collaborator

@adrianbr the only way to fix it is to try that ourselves.

  1. there's a local runner: https://github.com/aws/aws-mwaa-local-runner/tree/v2.7.2 (see also the MWAA dependency docs: https://docs.aws.amazon.com/mwaa/latest/userguide/working-dags-dependencies.html)
  2. we need to figure out how to run our helper on it
  3. possibly update our airflow CI workflow to run some dags on MWAA
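A possible starting point for reproducing the reported import and version conflicts inside the local runner is a small diagnostic script like the one below. This is only a sketch: the helper names `installed_versions` and `can_import` are made up for illustration, and the package list is a guess based on the packages named in the report. `dlt.helpers.airflow_helper` is the module that `PipelineTasksGroup` is imported from.

```python
import importlib
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


def can_import(module_name):
    """Return True if the module imports cleanly in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Package and module names below are assumptions taken from the user report.
    packages = ["apache-airflow", "dlt", "s3fs", "aiobotocore"]
    for name, version in installed_versions(packages).items():
        print(f"{name}: {version or 'not installed'}")
    for module in ["dlt", "dlt.helpers.airflow_helper"]:
        print(f"import {module}: {'ok' if can_import(module) else 'FAILED'}")
```

Running it in the environment where the DAGs are parsed should show which of the two problems (missing module or mismatched version) is actually in play.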

@rudolfix added the community label on Mar 3, 2024
@rubenhelsloot

I might be able to help here. I have been using dlt with MWAA successfully. I'm not writing to Snowflake, but have run into problems with the constraints multiple times.

My workaround was to add --constraints /dev/null to the uploaded requirements.txt file, which overrides the default constraints imposed by Airflow. It seems hacky, but AWS actually encourages this if you don't like the imposed constraints.

Idea came from this article
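To make the workaround concrete, here is a minimal sketch that generates a requirements.txt whose first line points the constraints file at /dev/null. The function name `build_requirements` and the package list are hypothetical, not taken from the comment above. pip documents the option as `--constraint` (short form `-c`), so that spelling is used here.

```python
def build_requirements(packages, constraint_file="/dev/null"):
    """Return requirements.txt text with the constraint override as the first line."""
    # An empty constraints file such as /dev/null means pip applies no extra pins.
    lines = [f"--constraint {constraint_file}"]
    lines.extend(packages)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical package list; pin versions as needed for your environment.
    print(build_requirements(["dlt[filesystem]", "s3fs"]), end="")
```

The trade-off is that pip then resolves versions freely, so it is probably wise to pin the packages you care about explicitly.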

@jabjakub

jabjakub commented Aug 26, 2024

We had a similar issue on v2.2.2 (--constraints was not needed in this version, though) and decided to install the additional packages directly in a virtual environment using the PythonVirtualenvOperator. Note that virtualenv first needs to be part of your requirements.txt.

https://docs.aws.amazon.com/mwaa/latest/userguide/samples-virtualenv.html
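For readers who want to try the virtualenv approach, a rough sketch of such a DAG follows, assuming Airflow 2.x. The DAG id, task id, pipeline name, sample data, and the unpinned package list are all hypothetical. It also assumes the filesystem destination's bucket URL and credentials are supplied through dlt configuration (for example, environment variables), which is not shown here.

```python
from datetime import datetime

# Hypothetical list of packages installed into the task's virtualenv.
VENV_REQUIREMENTS = ["dlt[filesystem]", "s3fs"]


def load_with_dlt():
    # Import inside the function: its body runs in the freshly built virtualenv,
    # not in the Airflow worker's own environment.
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name="mwaa_demo",   # hypothetical name
        destination="filesystem",    # bucket URL assumed to come from dlt config
        dataset_name="demo",
    )
    print(pipeline.run([{"id": 1}], table_name="items"))


def build_dag():
    # Airflow imports are local so the module still imports where Airflow is absent.
    from airflow import DAG
    from airflow.operators.python import PythonVirtualenvOperator

    with DAG(
        dag_id="dlt_in_virtualenv",
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,  # Airflow 2.x argument; trigger manually
        catchup=False,
    ) as dag:
        PythonVirtualenvOperator(
            task_id="load_with_dlt",
            python_callable=load_with_dlt,
            requirements=VENV_REQUIREMENTS,
            system_site_packages=False,
        )
    return dag


try:
    dag = build_dag()  # Airflow picks up DAG objects defined at module level
except ImportError:
    dag = None  # Airflow not installed, e.g. when linting locally
```

Because the callable's source is shipped to the virtualenv, it should not reference names from the enclosing module, which is why `dlt` is imported inside the function.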
