Add a cli command to clean up all metadata related to a specific pipeline #653

Open
francescomucio opened this issue Sep 25, 2023 · 2 comments

Comments

@francescomucio
Contributor

Feature description

While developing a new pipeline, it is sometimes necessary to clean up all of the pipeline's metadata and its associated tables.

For example, after manually dropping the tables created by dlt, the dlt library may still attempt to truncate tables that no longer exist. This can happen because a schema definition still exists somewhere, either in the local dlt folder or in the destination database.

Residual metadata for a specific pipeline can remain in the _dlt_version and _dlt_pipeline_state tables, or in the local cache in the ~/.dlt/pipelines directory.

Therefore, it's important to be able to clean up all metadata related to a specific pipeline before running it again. This includes deleting the tables in the destination schema and in its _staging counterpart, cleaning up the data for that pipeline in the ~/.dlt/pipelines directory, and deleting the rows associated with that pipeline in the _dlt_version and _dlt_pipeline_state tables.
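To make the manual cleanup described above concrete, here is a minimal sketch of two of the steps: removing the local working folder and deleting the metadata rows. It is not dlt's own implementation. The helper names are hypothetical, the folder layout (one subfolder per pipeline under ~/.dlt/pipelines) and the column names `pipeline_name` and `schema_name` are assumptions to check against your destination, and `sqlite3` only stands in for a real destination connection.

```python
import shutil
import sqlite3
from pathlib import Path


def remove_local_pipeline_dir(pipeline_name: str, pipelines_root: Path) -> bool:
    """Delete the local working folder of one pipeline.

    Returns True if a folder was removed, False if there was nothing to remove.
    Assumes one subfolder per pipeline under pipelines_root.
    """
    # Refuse names that could escape pipelines_root (e.g. "..", "a/b").
    if pipeline_name in ("", ".", "..") or Path(pipeline_name).name != pipeline_name:
        raise ValueError(f"unsafe pipeline name: {pipeline_name!r}")
    target = pipelines_root / pipeline_name
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


def delete_pipeline_rows(
    conn: sqlite3.Connection, pipeline_name: str, schema_name: str
) -> None:
    """Delete the metadata rows of one pipeline in a single transaction.

    Column names are assumptions; verify them in your destination first.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "DELETE FROM _dlt_pipeline_state WHERE pipeline_name = ?",
            (pipeline_name,),
        )
        conn.execute(
            "DELETE FROM _dlt_version WHERE schema_name = ?",
            (schema_name,),
        )
```

A call such as `remove_local_pipeline_dir("my_pipeline", Path.home() / ".dlt" / "pipelines")` would cover the local cache. Dropping the destination tables and the _staging ones is left out because it depends on the destination.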

It would be nice to have a cli command like dlt pipeline pipeline_name clean

Are you a dlt user?

None

Use case

No response

Proposed solution

It would be nice to have a cli command like dlt pipeline pipeline_name clean

Related issues

No response

@rudolfix
Collaborator

@francescomucio what about this: https://dlthub.com/docs/reference/command-line-interface#selectively-drop-tables-and-reset-state ? I think it does exactly what you want:

  • drops tables and cleans up associated state
  • if the tables are already dropped, it just cleans up the state
  • you can drop any part of state separately
  • resets the schemas for the dropped tables (both in working folder and in database)
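For scripting the drop command from Python, a small sketch that only assembles the command line might look like the following. The helper `build_drop_command` is hypothetical; the command shape `dlt pipeline <pipeline_name> drop <resources>` and the `--drop-all` flag are taken from the linked CLI reference as I understand it, so treat them as assumptions and confirm with `dlt pipeline --help` for your version.

```python
import shlex
from typing import List, Sequence


def build_drop_command(
    pipeline_name: str,
    resources: Sequence[str] = (),
    drop_all: bool = False,
) -> List[str]:
    """Assemble (but do not run) a `dlt pipeline ... drop` invocation."""
    if not resources and not drop_all:
        raise ValueError("pass at least one resource or set drop_all=True")
    cmd = ["dlt", "pipeline", pipeline_name, "drop", *resources]
    if drop_all:
        # Assumed flag for dropping every resource; verify before relying on it.
        cmd.append("--drop-all")
    return cmd


# Print the command so it can be reviewed before running it by hand.
print(shlex.join(build_drop_command("my_pipeline", ["events"])))
# prints: dlt pipeline my_pipeline drop events
```

Returning an argument list rather than a string keeps it safe to pass to `subprocess.run` without shell quoting issues, once the command has been checked.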

@francescomucio
Contributor Author

Hi @rudolfix, I didn't know about the drop command. I ended up here after a discussion in the dlt Slack.

Unfortunately, I am not able to test the pipeline drop command right now. Does the drop also clean the local cache in ~/.dlt/pipelines?
