Sync table formats metadata to multiple catalogs. #590

Open
2 tasks done
vinishjail97 opened this issue Dec 6, 2024 · 2 comments

Comments

@vinishjail97
Contributor

vinishjail97 commented Dec 6, 2024

Feature Request / Improvement

Context

Users of Apache XTable (Incubating) can already translate metadata across table formats (Iceberg, Hudi, and Delta) and use the resulting tables on the platform of their choice. Some usability friction remains, however: users must explicitly register the tables in their chosen catalog (Glue, HMS, Unity, BigLake, etc.) and then use that catalog from their platform to run DDL and DML queries.

XTable is built on the principle of omnidirectional interoperability. I'm proposing an interface that syncs table format metadata to multiple catalogs in a continuous and incremental manner.

Why do we need this feature?

  1. Reduce friction for XTable users - After generating metadata, XTable sync will register the tables in the catalogs the user chooses. Users who work with a single format can still use XTable to sync that metadata across multiple catalogs.
  2. Avoid catalog lock-in - There's no reason why data and metadata in storage should be registered in only one catalog. Users can register the same table in multiple catalogs depending on the use case, ecosystem, and features each catalog provides.

Implementation

I have submitted a PR with the interfaces for CatalogSync and CatalogSyncOperations and am open to feedback on this feature request.
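
To make the idea concrete, here is a minimal Java sketch of syncing one table to several catalogs incrementally. It is not the code from the PR: this issue names CatalogSync and CatalogSyncOperations but does not show their signatures, so every type and method below (`TableRef`, `CatalogClient`, `InMemoryCatalog`, `syncToCatalogs`) is a hypothetical stand-in, and the catalogs are in-memory maps rather than real Glue/HMS clients.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CatalogSyncSketch {

  /** Hypothetical description of a table whose metadata already exists in storage. */
  static final class TableRef {
    final String name;
    final String format;
    final String basePath;
    final long metadataVersion;

    TableRef(String name, String format, String basePath, long metadataVersion) {
      this.name = name;
      this.format = format;
      this.basePath = basePath;
      this.metadataVersion = metadataVersion;
    }
  }

  /** Hypothetical per-catalog client; a real one would wrap a catalog SDK. */
  interface CatalogClient {
    String catalogId();

    /** Returns the metadata version registered for the table, or null if absent. */
    Long registeredVersion(String tableName);

    /** Creates the table entry or points an existing entry at newer metadata. */
    void register(TableRef table);
  }

  /** In-memory stand-in for a catalog, used only to keep the sketch runnable. */
  static final class InMemoryCatalog implements CatalogClient {
    private final String id;
    private final Map<String, Long> versions = new HashMap<>();

    InMemoryCatalog(String id) {
      this.id = id;
    }

    public String catalogId() {
      return id;
    }

    public Long registeredVersion(String tableName) {
      return versions.get(tableName);
    }

    public void register(TableRef table) {
      versions.put(table.name, table.metadataVersion);
    }
  }

  enum SyncStatus { CREATED, REFRESHED, UP_TO_DATE, FAILED }

  /** Syncs one table to every catalog; a failure in one catalog does not block the others. */
  static Map<String, SyncStatus> syncToCatalogs(TableRef table, List<CatalogClient> catalogs) {
    Map<String, SyncStatus> results = new LinkedHashMap<>();
    for (CatalogClient catalog : catalogs) {
      try {
        Long registered = catalog.registeredVersion(table.name);
        if (registered == null) {
          catalog.register(table);
          results.put(catalog.catalogId(), SyncStatus.CREATED);
        } else if (registered < table.metadataVersion) {
          // Incremental step: only touch the catalog when storage has newer metadata.
          catalog.register(table);
          results.put(catalog.catalogId(), SyncStatus.REFRESHED);
        } else {
          results.put(catalog.catalogId(), SyncStatus.UP_TO_DATE);
        }
      } catch (RuntimeException e) {
        results.put(catalog.catalogId(), SyncStatus.FAILED);
      }
    }
    return results;
  }

  public static void main(String[] args) {
    List<CatalogClient> catalogs = Arrays.<CatalogClient>asList(
        new InMemoryCatalog("catalog-a"), new InMemoryCatalog("catalog-b"));
    TableRef table = new TableRef("orders", "ICEBERG", "/tmp/warehouse/orders", 1L);
    System.out.println(syncToCatalogs(table, catalogs));
    System.out.println(syncToCatalogs(table, catalogs));
  }
}
```

Recording a per-catalog status, instead of failing the whole sync on the first error, is one way to keep the catalogs independent of each other; whether the actual interfaces take this approach is a question for the PR review.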

DISCUSS thread on the project mailing list:
https://lists.apache.org/thread/654n5t5dvysf5bxph2cdm79s9tmz30vg

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@vinishjail97 vinishjail97 self-assigned this Dec 6, 2024
@vinishjail97 vinishjail97 added the enhancement New feature or request label Dec 6, 2024
@vinishjail97
Contributor Author

Submitted the PR for interfaces -> #591

@purnachandergit

This is a good feature; it will help a lot.
