[pipeline] implement state merge logic in case of parallel processing or state restoration from destination #16

Open
4 of 5 tasks
rudolfix opened this issue Jun 15, 2022 · 0 comments
rudolfix commented Jun 15, 2022

Implement proper state management in Pipeline:

  1. State, as visible to the source, is just a dictionary (read and written via the indexer).
  2. Each source iterator that requires state should operate on its own namespace (Pipeline.state["namespace"] for explicit names, or Pipeline.state for implicit names based e.g. on the name of the calling module).
  3. Writing to state may only happen after the iterator has stopped. A violation must be detected and the extraction must be aborted.
  4. State is preserved atomically together with the results of the extraction.
  5. If several iterators operating on the same state are returned, a merge strategy applies: the maximum value of all conflicting writes is preserved (see e.g. https://docs.meltano.com/guide/integration#internal-state-merge-logic).
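
A minimal sketch of the merge rule from point 5, assuming state is a plain nested dictionary whose conflicting leaf values are mutually comparable. The function name `merge_states` and the example namespaces (`orders`, `users`) are hypothetical and not part of the Pipeline API.

```python
from typing import Any, Dict

State = Dict[str, Any]


def merge_states(base: State, update: State) -> State:
    """Merge two state dicts; on conflicting leaf writes keep the maximum."""
    merged: State = dict(base)  # shallow copy so `base` is not modified
    for key, value in update.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], dict) and isinstance(value, dict):
            # Namespaces (nested dicts) are merged recursively.
            merged[key] = merge_states(merged[key], value)
        else:
            # Conflicting write: keep the larger value (assumes comparable values).
            merged[key] = max(merged[key], value)
    return merged


# Two iterators wrote to the same (hypothetical) "orders" namespace.
state_a = {"orders": {"last_id": 120, "updated_at": "2022-06-14"}}
state_b = {"orders": {"last_id": 95, "updated_at": "2022-06-15"}, "users": {"last_id": 7}}
merged_state = merge_states(state_a, state_b)
```

Here `merged_state["orders"]` keeps `last_id` from `state_a` and `updated_at` from `state_b`, since each is the larger of the two conflicting writes. Values of different types under the same key would raise a `TypeError`, which a real implementation would need to handle.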

Implementation

  1. Consider locking the state file when merging (e.g. with https://py-filelock.readthedocs.io/en/latest/index.html).
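
A rough sketch of a locked read-merge-write cycle on a state file. To stay dependency-free it uses a standard-library lock file (created with `O_EXCL`) as a stand-in for py-filelock; it is not that library's API. The names `exclusive_lock` and `merge_into_file`, the JSON file format, and the flat (non-nested) state are all assumptions made for illustration.

```python
import json
import os
import tempfile
import time
from contextlib import contextmanager
from typing import Any, Dict, Iterator


@contextmanager
def exclusive_lock(lock_path: str, timeout: float = 5.0, poll: float = 0.05) -> Iterator[None]:
    """Hold an exclusive lock by atomically creating `lock_path`."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def merge_into_file(state_path: str, update: Dict[str, Any]) -> Dict[str, Any]:
    """Read, merge and rewrite the state file under a lock; conflicts keep the maximum."""
    with exclusive_lock(state_path + ".lock"):
        state: Dict[str, Any] = {}
        if os.path.exists(state_path):
            with open(state_path, "r", encoding="utf-8") as f:
                state = json.load(f)
        for key, value in update.items():
            state[key] = max(state[key], value) if key in state else value
        # Write to a temp file, then rename, so readers never see a partial file.
        directory = os.path.dirname(os.path.abspath(state_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
        os.replace(tmp_path, state_path)
        return state


# Two writers merging into the same (temporary) state file, one after the other.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "state.json")
merge_into_file(path, {"last_id": 120})
final_state = merge_into_file(path, {"last_id": 95, "cursor": 3})
```

One limitation of this stand-in: if a process dies while holding the lock, the stale lock file blocks later writers until it is removed by hand. OS-level locks, which a dedicated locking library typically relies on, are released when the process exits.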
rudolfix self-assigned this on Jun 15, 2022
rudolfix changed the title from "[singer] implement state management" to "[pipeline] implement state management" on Jun 15, 2022
rudolfix changed the title from "[pipeline] implement state management" to "[pipeline] implement state merge logic in case of parallel processing or state restoration from destination" on Jan 20, 2023