Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor: move resolve_overlap a method on OrderingEquivalenceClass #14138

Merged
merged 1 commit into from
Jan 16, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 22 additions & 16 deletions datafusion/physical-expr/src/equivalence/ordering.rs
Original file line number Diff line number Diff line change
Expand Up @@ -135,12 +135,12 @@ impl OrderingEquivalenceClass {
let mut ordering_idx = idx + 1;
let mut removal = self.orderings[idx].is_empty();
while ordering_idx < self.orderings.len() {
work |= resolve_overlap(&mut self.orderings, idx, ordering_idx);
work |= self.resolve_overlap(idx, ordering_idx);
if self.orderings[idx].is_empty() {
removal = true;
break;
}
work |= resolve_overlap(&mut self.orderings, ordering_idx, idx);
work |= self.resolve_overlap(ordering_idx, idx);
if self.orderings[ordering_idx].is_empty() {
self.orderings.swap_remove(ordering_idx);
} else {
Expand All @@ -156,6 +156,26 @@ impl OrderingEquivalenceClass {
}
}

/// Trims `orderings[idx]` if some suffix of it overlaps with a prefix of
/// `orderings[pre_idx]`. Returns `true` if there is any overlap, `false` otherwise.
///
/// For example, if `orderings[idx]` is `[a ASC, b ASC, c DESC]` and
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added a small example

/// `orderings[pre_idx]` is `[b ASC, c DESC]`, then the function will trim
/// `orderings[idx]` to `[a ASC]`.
fn resolve_overlap(&mut self, idx: usize, pre_idx: usize) -> bool {
let length = self.orderings[idx].len();
let other_length = self.orderings[pre_idx].len();
for overlap in 1..=length.min(other_length) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I think this is a very hot path in the particular queries. as it is something like O(N^3) in the number of potential sort orders. I am still investigating

if self.orderings[idx][length - overlap..]
== self.orderings[pre_idx][..overlap]
{
self.orderings[idx].truncate(length - overlap);
return true;
}
}
false
}

/// Returns the concatenation of all the orderings. This enables merge
/// operations to preserve all equivalent orderings simultaneously.
pub fn output_ordering(&self) -> Option<LexOrdering> {
Expand Down Expand Up @@ -226,20 +246,6 @@ impl IntoIterator for OrderingEquivalenceClass {
}
}

/// Trims `orderings[idx]` if some suffix of it overlaps with a prefix of
/// `orderings[pre_idx]`. Returns `true` if there is any overlap, `false` otherwise.
fn resolve_overlap(orderings: &mut [LexOrdering], idx: usize, pre_idx: usize) -> bool {
let length = orderings[idx].len();
let other_length = orderings[pre_idx].len();
for overlap in 1..=length.min(other_length) {
if orderings[idx][length - overlap..] == orderings[pre_idx][..overlap] {
orderings[idx].truncate(length - overlap);
return true;
}
}
false
}

impl Display for OrderingEquivalenceClass {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "[")?;
Expand Down
Loading