Runtime: Support persistent storage for distributed workers #3854

Open
1 task
begelundmuller opened this issue Jan 16, 2024 · 0 comments
As we look toward separating ETL and serving, and running ETL on worker nodes, the runtime will need the ability to persist ETL state in GCS.

Requirements:

  • Use a system-wide GCS connection for ETL state for all instances (to make configuration easier and to prevent corruption).
  • Ensure complete isolation of data between instances.
  • Provide the ability to track and limit usage per instance.
  • When running locally, allow using a sub-directory of the tmp directory for ETL state instead of GCS.
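
To make the isolation and usage-limit requirements concrete, here is a minimal sketch for the local (tmp directory) case. It is only an illustration, not a proposed runtime API: `InstanceStore`, `QuotaExceeded`, and the `limit_bytes` parameter are hypothetical names, and a GCS-backed variant would need the same checks applied to object key prefixes.

```python
from pathlib import Path


class QuotaExceeded(Exception):
    """Raised when a write would push an instance past its byte limit."""


class InstanceStore:
    """Hypothetical per-instance view over a shared storage root."""

    def __init__(self, root: Path, instance_id: str, limit_bytes: int):
        # Reject IDs that could escape the shared root (e.g. "..", "a/b").
        if not instance_id.replace("-", "").replace("_", "").isalnum():
            raise ValueError(f"invalid instance id: {instance_id!r}")
        self._dir = (root / instance_id).resolve()
        self._dir.mkdir(parents=True, exist_ok=True)
        self._limit = limit_bytes

    def _path(self, key: str) -> Path:
        path = (self._dir / key).resolve()
        # Isolation: every key must resolve strictly inside this instance's directory.
        if path == self._dir or not path.is_relative_to(self._dir):
            raise ValueError(f"key escapes instance scope: {key!r}")
        return path

    def usage(self) -> int:
        # Usage tracking: total bytes currently stored for this instance.
        return sum(p.stat().st_size for p in self._dir.rglob("*") if p.is_file())

    def write(self, key: str, data: bytes) -> None:
        path = self._path(key)
        existing = path.stat().st_size if path.is_file() else 0
        # Usage limit: account for the bytes being replaced on overwrite.
        if self.usage() - existing + len(data) > self._limit:
            raise QuotaExceeded(key)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, key: str) -> bytes:
        return self._path(key).read_bytes()
```

Scanning the directory on every write is acceptable for a local sketch; a cloud implementation would likely need a cheaper way to track usage.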

Tasks:

  • Design proposal (one idea is to leverage SystemConnectors and configure it to a local file driver on local and GCS on cloud)
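
As a starting point for the design discussion, the idea of one system-wide connector that resolves to a local file driver or to GCS could look roughly like the sketch below. Everything here is an assumption for illustration: `StateConnector`, `state_location`, the `etl_state` prefix, and the `"file"`/`"gcs"` driver names are hypothetical and do not reflect how SystemConnectors is actually defined in the runtime.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class StateConnector:
    """Hypothetical system-wide connector config shared by all instances."""

    driver: str  # "file" on local, "gcs" on cloud (assumed names)
    root: str    # base directory for "file", bucket name for "gcs"


def state_location(conn: StateConnector, instance_id: str) -> str:
    """Return the instance-scoped location where ETL state would live."""
    if conn.driver == "file":
        # Local: a sub-directory of the configured tmp directory.
        return str(Path(conn.root) / "etl_state" / instance_id)
    if conn.driver == "gcs":
        # Cloud: one shared bucket, one key prefix per instance.
        return f"gs://{conn.root}/etl_state/{instance_id}/"
    raise ValueError(f"unsupported driver: {conn.driver!r}")
```

For example, `state_location(StateConnector("gcs", "my-bucket"), "abc")` yields `gs://my-bucket/etl_state/abc/`, so instances never need their own GCS configuration.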