-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add custom iceberg metadata cleaner #621
base: main
Are you sure you want to change the base?
Conversation
.expireOlderThan( | ||
Instant.now().minus(snapshotRetentionInHours, ChronoUnit.HOURS).toEpochMilli()) | ||
.cleanExpiredFiles(!useInternalIcebergCleaner); // is internal cleaner is enabled, disable iceberg cleaner | ||
List<Snapshot> removedSnapshots = expireSnapshots.apply(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expiredSnapshots.apply()
will apply the changes and returns list of snapshots that will be deleted based on provided parameters without committing them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does expiredSnapshots.apply()
do a a list for the metadata folder for this ? I was trying to understand how expensive this could be for example if we retained 2 days of manifests assuming we have a commit every 5min.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this will only operate on TableMetadata to compute the snapshots needs to be retained based on the provided retention time. Metadata file listing happens only as part of cleaning.
// only remove files that were deleted in an ancestor of the current table state to avoid | ||
// physically deleting files that were logically deleted in a commit that was rolled back. | ||
Set<Long> ancestorIds = | ||
Sets.newHashSet(SnapshotUtil.ancestorIds(latest, snapshotLookup(removedSnapshots, table))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To fetch the ancestors of table before expiring snapshots, we assume total table snapshots before committing = total table snapshots after committing + removed snapshots
transaction.commitTransaction(); | ||
// after commit is complete, clean up the manifest files | ||
if (useInternalIcebergCleaner) { | ||
cleanExpiredSnapshots(removedSnapshots); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clean up expired snapshots, we're relying on table's current state and list of removed snapshots.
Another approach would be compare metadata before expire snapshots is committed and after. This would give us the exact changes expire snapshots commit has made (Iceberg uses this approach).
But the problem with this approach is how to get the intermediate TableMetadata before commit is made, since Transaction class doesn't expose intermediate state before transaction is committed. One way is casting Transaction to BaseTransaction which exposes TransactionTableOperations which provides intermediate TableMetadata
TableMetadata beforeExpiration = ((BaseTransaction) transaction).underlyingOps().refresh();
transaction
.expireSnapshots()
.expireOlderThan(
Instant.now().minus(snapshotRetentionInHours, ChronoUnit.HOURS).toEpochMilli())
.deleteWith(this::safeDelete) // ensures that only metadata files are deleted
.cleanExpiredFiles(true)
.commit();
TableMetadata afterExpiration = ((BaseTransaction) transaction).underlyingOps().refresh();
// do metadata clean based on before and after expiration metadata
cleanUp(beforeExpiration, afterExpiration)
private void cleanExpiredSnapshots(List<Snapshot> removedSnapshots) { | ||
IcebergMetadataCleanupStrategy cleanupStrategy = new IcebergMetadataFileCleaner(transaction.table().io(), deleteExecutorService, planExecutorService, this::safeDelete); | ||
cleanupStrategy.cleanFiles(table, removedSnapshots); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be run in async to further improve the performance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @vamsikarnika
Could you please provide more details about the motivation behind this change and which table formats it impacts? At the moment, it seems to be specific to Iceberg and the some more context would be helpful.
Important Read
What is the purpose of the pull request
(For example: This pull request implements the sync for delta format.)
Brief change log
(for example:)
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)