post run pipeline state introspection #184

Closed
rudolfix opened this issue Mar 15, 2023 · 0 comments · Fixed by #199

rudolfix commented Mar 15, 2023

Background
Checking what happened to the data during a pipeline run should be easier, both from Python code and from the CLI.

Tasks

    • Document how to get the state and the trace via the CLI and from code.
    • Document delete_completed_jobs in load (False by default) and document what is left behind.
    • Change it to delete only completed jobs, as was the case before.
    • Add a method to the pipeline that fully removes a completed package.
    • Load storage should return the status of a package, including the schema name, jobs, table names and statuses. Failed packages should contain error names, and links to the files should also be present. The info should be JSON-loadable with dlt user log / callback log #73.
    • The load module should persist the effective schema updates: a list of the tables and columns that were added or updated, including flags for variant columns. See Schema change log access #134.
    • Document PipelineStepFailed and __context__ exception chaining. Document terminal vs. non-terminal exceptions when loading. dlt does not retry the run or any other step.
    • Add a config parameter to fail loading on a terminal exception in a load job.
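
To make the exception-chaining task concrete, here is a minimal sketch of how user code could walk the `__context__` chain of a failed step to find the underlying error. It relies only on standard Python exception chaining. Both exception classes below are stand-ins defined for this example (the `PipelineStepFailed` here is not dlt's real class, and `TerminalLoadError`, `run_load_step`, `exception_chain` and `find_terminal` are hypothetical names), so treat it as an illustration of the pattern rather than of the library's API.

```python
from typing import List, Optional


class TerminalLoadError(Exception):
    """Hypothetical: a load job failure that retrying cannot fix."""


class PipelineStepFailed(Exception):
    """Stand-in for dlt's PipelineStepFailed; the real class may differ."""

    def __init__(self, step: str) -> None:
        super().__init__(f"pipeline step '{step}' failed")
        self.step = step


def exception_chain(exc: BaseException) -> List[BaseException]:
    """Return exc followed by each exception it was raised from or during."""
    chain: List[BaseException] = []
    seen = set()
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        chain.append(current)
        seen.add(id(current))
        # Prefer the explicit cause (`raise ... from ...`), else the implicit context.
        if current.__cause__ is not None:
            current = current.__cause__
        else:
            current = current.__context__
    return chain


def find_terminal(exc: BaseException) -> Optional[TerminalLoadError]:
    """Return the first terminal error in the chain, or None if there is none."""
    for item in exception_chain(exc):
        if isinstance(item, TerminalLoadError):
            return item
    return None


def run_load_step() -> None:
    """Simulate a load step that wraps a terminal job error."""
    try:
        raise TerminalLoadError("column type mismatch")
    except TerminalLoadError:
        # Raising inside `except` sets __context__ on the new exception.
        raise PipelineStepFailed("load")


if __name__ == "__main__":
    try:
        run_load_step()
    except PipelineStepFailed as failed:
        root = find_terminal(failed)
        print(f"step={failed.step} terminal_error={root!r}")
```

Since dlt does not retry the run or any other step, a caller could use a check like `find_terminal` to decide whether retrying is worthwhile at all.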