[Enhancement] support push down agg distinct limit #55455

Merged: 2 commits, Jan 27, 2025

@stdpain commented Jan 26, 2025

Why I'm doing:

select distinct upper(lo_orderkey),lo_linenumber from lineorder limit 10;
            CPU          mem
baseline    30.70 sec    45.454G
patched     21 ms        12.430 MB

What I'm doing:

In this patch, we support pushing the DISTINCT limit down to the streaming aggregation.
When a limit is pushed down, the streaming agg is forced into pre-aggregation (preagg) mode. Otherwise the limit may produce wrong results: for example, duplicate values passed through in streaming mode would count toward the limit.
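To illustrate why the pushed-down limit needs deduplication first, here is a minimal toy model in Python. It is not the StarRocks implementation; the function names (`passthrough_with_limit`, `preagg_with_limit`, `final_distinct`) are hypothetical and only mimic the idea of a local stage feeding a final DISTINCT stage.

```python
from typing import Iterable, List


def passthrough_with_limit(rows: Iterable[str], limit: int) -> List[str]:
    """Toy local stage WITHOUT dedup: forwards rows as-is and stops at `limit` rows."""
    out: List[str] = []
    for row in rows:
        if len(out) >= limit:
            break
        out.append(row)
    return out


def preagg_with_limit(rows: Iterable[str], limit: int) -> List[str]:
    """Toy local stage WITH dedup: stops once `limit` distinct keys were seen."""
    seen = set()
    out: List[str] = []
    for row in rows:
        if row in seen:
            continue  # a duplicate must not count toward the limit
        seen.add(row)
        out.append(row)
        if len(out) >= limit:
            break
    return out


def final_distinct(rows: Iterable[str], limit: int) -> List[str]:
    """Toy final stage: order-preserving dedup followed by the limit."""
    return list(dict.fromkeys(rows))[:limit]


rows = ["a", "a", "a", "b", "c", "d"]
limit = 3

# Duplicates consume the limit, so the final stage sees too few distinct keys.
bad = final_distinct(passthrough_with_limit(rows, limit), limit)
# Dedup before limiting keeps the result complete.
good = final_distinct(preagg_with_limit(rows, limit), limit)

print(bad)   # ['a']
print(good)  # ['a', 'b', 'c']
```

In this toy input the pass-through stage spends the whole limit on copies of `a`, so the final result has one row instead of three. Deduplicating before applying the limit avoids that, which is the reason given above for forcing preagg when the limit is pushed down.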

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This PR needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport PR

Bugfix cherry-pick branch check:

  • I have checked the version labels for the target branches to which this PR will be auto-backported
    • 3.4
    • 3.3
    • 3.2
    • 3.1
    • 3.0

@stdpain force-pushed the support_agg_distinct_limit branch from 377708b to 160d3b5 on January 26, 2025 08:00
Seaven previously approved these changes Jan 26, 2025
satanson previously approved these changes Jan 26, 2025
Signed-off-by: stdpain <[email protected]>
@stdpain dismissed stale reviews from satanson and Seaven via 3465b0b on January 26, 2025 08:54

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

[FE Incremental Coverage Report]

pass : 8 / 8 (100.00%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/qe/SessionVariable.java 2 2 100.00% []
🔵 com/starrocks/sql/optimizer/rule/transformation/SplitTwoPhaseAggRule.java 5 5 100.00% []
🔵 com/starrocks/sql/plan/PlanFragmentBuilder.java 1 1 100.00% []

[BE Incremental Coverage Report]

pass : 28 / 30 (93.33%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/exec/aggregate/agg_hash_set.h 9 11 81.82% [96, 132]
🔵 be/src/exec/pipeline/aggregate/aggregate_distinct_streaming_sink_operator.cpp 13 13 100.00% []
🔵 be/src/exec/pipeline/aggregate/spillable_aggregate_blocking_sink_operator.cpp 2 2 100.00% []
🔵 be/src/exec/aggregate/aggregate_blocking_node.cpp 3 3 100.00% []
🔵 be/src/exec/aggregate/agg_hash_variant.cpp 1 1 100.00% []

@Seaven merged commit 4f45265 into StarRocks:main on Jan 27, 2025
50 checks passed

@Mergifyio backport branch-3.4

@Mergifyio backport branch-3.3

@Mergifyio backport branch-3.2

@github-actions bot removed the 3.2 label on Jan 27, 2025

@Mergifyio backport branch-3.1

@Mergifyio backport branch-3.0

@github-actions bot removed the 3.1 label on Jan 27, 2025

mergify bot commented Jan 27, 2025

backport branch-3.4

✅ Backports have been created

@github-actions bot removed the 3.0 label on Jan 27, 2025

mergify bot commented Jan 27, 2025

backport branch-3.3

✅ Backports have been created

mergify bot commented Jan 27, 2025

backport branch-3.2

✅ Backports have been created

mergify bot commented Jan 27, 2025

backport branch-3.1

✅ Backports have been created

mergify bot commented Jan 27, 2025

backport branch-3.0

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jan 27, 2025
Signed-off-by: stdpain <[email protected]>
(cherry picked from commit 4f45265)

# Conflicts:
#	be/src/exec/pipeline/aggregate/aggregate_distinct_streaming_sink_operator.cpp
mergify bot pushed a commit that referenced this pull request Jan 27, 2025
Signed-off-by: stdpain <[email protected]>
(cherry picked from commit 4f45265)

# Conflicts:
#	be/src/exec/aggregate/agg_hash_set.h
#	be/src/exec/pipeline/aggregate/aggregate_distinct_streaming_sink_operator.cpp
mergify bot pushed a commit that referenced this pull request Jan 27, 2025
Signed-off-by: stdpain <[email protected]>
(cherry picked from commit 4f45265)

# Conflicts:
#	be/src/exec/aggregate/agg_hash_set.h
#	be/src/exec/pipeline/aggregate/aggregate_distinct_streaming_sink_operator.cpp
#	fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
mergify bot pushed a commit that referenced this pull request Jan 27, 2025
Signed-off-by: stdpain <[email protected]>
(cherry picked from commit 4f45265)

# Conflicts:
#	be/src/exec/aggregate/agg_hash_set.h
#	be/src/exec/aggregate/aggregate_blocking_node.cpp
#	be/src/exec/pipeline/aggregate/aggregate_blocking_sink_operator.h
#	be/src/exec/pipeline/aggregate/aggregate_distinct_streaming_sink_operator.cpp
#	fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
mergify bot pushed a commit that referenced this pull request Jan 27, 2025
Signed-off-by: stdpain <[email protected]>
(cherry picked from commit 4f45265)

# Conflicts:
#	be/src/exec/aggregate/agg_hash_set.h
#	be/src/exec/aggregate/agg_hash_variant.cpp
#	be/src/exec/aggregate/agg_hash_variant.h
#	be/src/exec/aggregate/aggregate_blocking_node.cpp
#	be/src/exec/pipeline/aggregate/aggregate_blocking_sink_operator.h
#	be/src/exec/pipeline/aggregate/aggregate_distinct_streaming_sink_operator.cpp
#	be/src/exec/pipeline/aggregate/aggregate_distinct_streaming_sink_operator.h
#	be/src/exec/pipeline/aggregate/spillable_aggregate_blocking_sink_operator.cpp
#	fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
#	fe/fe-core/src/main/java/com/starrocks/sql/optimizer/rule/transformation/SplitTwoPhaseAggRule.java
#	fe/fe-core/src/test/java/com/starrocks/sql/plan/AggregateTest.java