feat(storage): support non_pk_prefix_watermark state cleaning #19889
@@ -39,6 +39,7 @@ use itertools::Itertools;
 use parking_lot::Mutex;
 use rand::seq::SliceRandom;
 use rand::thread_rng;
+use risingwave_common::catalog::TableId;
 use risingwave_common::util::epoch::Epoch;
 use risingwave_hummock_sdk::compact_task::{CompactTask, ReportTask};
 use risingwave_hummock_sdk::compaction_group::StateTableId;

@@ -728,6 +729,46 @@ impl HummockManager {
             }
         }
 
+        // Filter out the table that has a primary key prefix watermark.
+        let table_id_with_pk_prefix_watermark: HashSet<_> = self
+            .metadata_manager
+            .catalog_controller
+            .get_table_by_ids(
+                version
+                    .latest_version()
+                    .table_watermarks
+                    .keys()
+                    .map(|id| id.table_id() as _)
+                    .collect(),
+            )
+            .await
+            .map_err(|e| Error::Internal(e.into()))?
+            .into_iter()
+            .filter_map(|table| {
+                // pk prefix watermark.
+                if table.clean_watermark_index_in_pk.is_none()
+                    || table.clean_watermark_index_in_pk.unwrap() == 0
+                {
+                    Some(TableId::from(table.get_id()))
+                } else {
+                    None
+                }
+            })
+            .collect();
+
+        let table_watermarks = version
+            .latest_version()
+            .table_watermarks
+            .iter()
+            .filter_map(|(table_id, table_watermarks)| {
+                if table_id_with_pk_prefix_watermark.contains(table_id) {

Review comment: We already have a

Review comment: Also, if we filter out non pk prefix watermark here, how can compactor retrieve the non pk prefix watermark? Based on the logic here, it seems that we rely on the fact that non pk prefix watermark is present in the compact task.

Reply: Good catch, we should filter the watermark by WaterMarkType directly. And, the filtered results are only passed to the picker, while all relevant watermarks are passed to the compactor (pk or non-pk).

+                    Some((*table_id, table_watermarks.clone()))
+                } else {
+                    None
+                }
+            })
+            .collect();
+
         while let Some(compact_task) = compact_status.get_compact_task(
             version
                 .latest_version()

@@ -742,7 +783,7 @@ impl HummockManager {
                 selector,
                 &table_id_to_option,
                 developer_config.clone(),
-                &version.latest_version().table_watermarks,
+                &table_watermarks,
                 &version.latest_version().state_table_info,
             ) {
                 let target_level_id = compact_task.input.target_level as u32;

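The reply in the thread above proposes filtering watermarks by type directly, so that only the pk-prefix subset goes to the picker while the compactor still receives every watermark. A minimal sketch of that split, assuming a hypothetical WatermarkKind enum, a simplified TableWatermarks struct, and a hypothetical helper watermarks_for_picker (none of these are the actual RisingWave definitions):

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration; the real types in the
// RisingWave crates may be named and shaped differently.
type TableId = u32;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WatermarkKind {
    PkPrefix,
    NonPkPrefix,
}

#[derive(Clone, Debug, PartialEq)]
struct TableWatermarks {
    kind: WatermarkKind,
    watermark: Vec<u8>,
}

/// Builds the view handed to the picker: only pk-prefix watermarks.
/// The input map is left untouched, so the full set (pk and non-pk)
/// can still be attached to the compact task for the compactor.
fn watermarks_for_picker(
    all: &HashMap<TableId, TableWatermarks>,
) -> HashMap<TableId, TableWatermarks> {
    all.iter()
        .filter(|(_, w)| w.kind == WatermarkKind::PkPrefix)
        .map(|(id, w)| (*id, w.clone()))
        .collect()
}
```

Compared with the catalog lookup in the diff, this variant would not need an async call or an error path, at the cost of requiring the type to be recorded on each watermark entry.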
Review comment: Should we assert idx == 1 here?

Reply: No, row can be any length and index can be a generic function.