user log /callback log #73

Open
adrianbr opened this issue Nov 3, 2022 · 2 comments

adrianbr commented Nov 3, 2022

We need user-readable logs:

  • what was loaded (source) and where (hostname/destination/project/dataset)
  • ideally nr of
  • schema changes

and callback parameters:

  • Pipeline/ task name

  • env name - in Airflow, the base URL tells you which instance is failing.

  • logs URL

  • Schema changes

  • error message

  • success message:

    • how many rows were loaded to which table
    • if run metadata is logged persistently, summarize whether the row volume is normal (compared to previous runs)
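
To make the callback parameters above concrete, here is a rough sketch of what a payload and a success message could look like. Everything in it is an assumption for discussion: `LoadCallbackInfo`, `success_message`, the field names, and the `tolerance` rule for "normal" volume are hypothetical and not part of any existing API.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Optional


@dataclass
class LoadCallbackInfo:
    """Hypothetical payload handed to a user callback after a load."""
    pipeline_name: str
    env_name: str
    logs_url: str
    schema_changes: list[str] = field(default_factory=list)
    error_message: Optional[str] = None
    # table name -> number of rows loaded in this run
    rows_loaded: dict[str, int] = field(default_factory=dict)


def success_message(info: LoadCallbackInfo, past_totals: list[int],
                    tolerance: float = 0.5) -> str:
    """List rows per table and compare total volume with previous runs."""
    lines = [f"{info.pipeline_name} ({info.env_name}) succeeded"]
    for table, rows in sorted(info.rows_loaded.items()):
        lines.append(f"  {rows} rows -> {table}")
    total = sum(info.rows_loaded.values())
    if past_totals:
        avg = mean(past_totals)
        # Assumption: "normal" means within +/- tolerance of the past average.
        normal = abs(total - avg) <= tolerance * avg
        verdict = "normal" if normal else "unusual"
        lines.append(f"  volume {verdict}: {total} rows vs avg {avg:.0f}")
    return "\n".join(lines)
```

The volume comparison only works if totals from earlier runs are stored somewhere; with an empty history the summary line is simply left out.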

Partial success:

  • unless intentionally configured, this is a non-atomic load and should be avoided.
  • report failures - offer way to retry? offer way to

Retry:

  • keep track of retries to be able to report them on final failure
  • optionally report failures of individual retries. In some patterns you will want to turn off retry notifications and only notify on the final failure.
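
A minimal sketch of the retry tracking described above, in plain Python. `run_with_retries`, `RetryReport`, and the `notify` callable are hypothetical names used only for illustration; a real implementation would probably also wait between attempts.

```python
from dataclasses import dataclass, field
from typing import Callable, TypeVar

T = TypeVar("T")


@dataclass
class RetryReport:
    """Collects what happened on each failed attempt."""
    attempts: int = 0
    errors: list[str] = field(default_factory=list)


def run_with_retries(task: Callable[[], T], max_attempts: int,
                     notify: Callable[[str], None],
                     notify_on_retry: bool = False) -> T:
    """Run task, remembering every failure so the final report lists them all."""
    report = RetryReport()
    while True:
        report.attempts += 1
        try:
            return task()
        except Exception as exc:
            report.errors.append(f"attempt {report.attempts}: {exc}")
            if report.attempts >= max_attempts:
                # Final failure: report all retries in one notification.
                notify(f"failed after {report.attempts} attempts:\n"
                       + "\n".join(report.errors))
                raise
            if notify_on_retry:
                # Optional per-retry notification, off by default.
                notify(f"retrying after {report.errors[-1]}")
```

With `notify_on_retry=False` only the final failure produces a notification, and it still carries the history of every attempt.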

adrianbr commented Nov 4, 2022

Possibly offer an automated "lateness" alert based on the duration of past runs, for example a simple toggle that alerts on anything more than 3 standard deviations from the average.

Reason: sometimes the download takes much longer because more data was produced (possibly 100x more in DB migration/reconciliation cases), and sometimes the download is stuck. In another case (Airflow separates the scheduler from the runner), the SLA check is done by the scheduler because the runner might be dead: a zombie that died without reporting failure, or one that is still running but disconnected and unable to report success.
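
The toggle could be as small as the check below. This is only a sketch: `is_late` is a hypothetical helper, it uses the sample standard deviation from Python's `statistics` module, and it assumes a one-sided check (only runs that are slower than usual count as late).

```python
from statistics import mean, stdev


def is_late(elapsed_seconds: float, past_durations: list[float],
            n_std: float = 3.0) -> bool:
    """Return True if the current run exceeds mean + n_std * std of past runs."""
    if len(past_durations) < 2:
        # Not enough history to estimate a standard deviation.
        return False
    threshold = mean(past_durations) + n_std * stdev(past_durations)
    return elapsed_seconds > threshold
```

For the zombie case, the check would have to be evaluated by something outside the running task (the scheduler, in the Airflow example), since a dead or disconnected runner cannot report its own lateness.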


adrianbr commented Nov 7, 2022

It would be nice to also collect any variables that were sent to the resource when data was requested, so we can report back which instance/request/date/segment of the task failed.
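
One possible way to capture those variables, sketched in plain Python: wrap the generator that produces the data and attach the call parameters to any error it raises. `track_params` and `ResourceError` are hypothetical names, and the wrapper assumes the resource is called with keyword arguments only.

```python
import functools
from typing import Any, Callable, Iterator


class ResourceError(Exception):
    """Carries the parameters that were in effect when a resource failed."""

    def __init__(self, resource: str, params: dict[str, Any], cause: Exception):
        super().__init__(f"{resource} failed with params {params}: {cause}")
        self.resource = resource
        self.params = params


def track_params(func: Callable[..., Iterator[Any]]) -> Callable[..., Iterator[Any]]:
    """Wrap a generator function so failures report the arguments it was called with."""
    @functools.wraps(func)
    def wrapper(**params: Any) -> Iterator[Any]:
        try:
            yield from func(**params)
        except Exception as exc:
            raise ResourceError(func.__name__, params, exc) from exc
    return wrapper
```

A callback receiving `ResourceError.params` could then say which request, date, or segment failed instead of only naming the task.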
