-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unify Row hash and hash implementation #4924
Unify Row hash and hash implementation #4924
Conversation
…synnada-ai/arrow-datafusion into feature/row_accumulator_utilize # Conflicts: # datafusion/core/src/physical_plan/aggregates/hash.rs # datafusion/core/src/physical_plan/aggregates/mod.rs # datafusion/core/src/physical_plan/aggregates/row_hash.rs
Sending another commit seems to have triggered it. |
Thank you @mustafasrepo - I will put this on my review queue for tomorrow. Sounds awesome cc @crepererum and @tustvold |
Can we have a benchmark run for this? Basically run |
/// Aggregate expressions not supporting row accumulation | ||
normal_aggregate_expressions: Vec<Vec<Arc<dyn PhysicalExpr>>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we have one operator now but still two accumulator implementations? I think think this counts as "closing the ticket" though. We still have massive code duplication in accumulator implementations. Furthermore the implementation of this specific stream gets more complicated, but I agree that this would be one possible path forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, one operator but multiple accumulators. Having a single accumulator implementation without performance impact is a tough goal and there seems to be some open questions around it. For example, some accumulation tasks involve non-fixed states, how do we implement them in the row-accumulator style?
Until we see a clear path forward in that regard, I think this is a good first step towards full unification (if such a thing is possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #2723 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I.e. I'm personally convinced that the row accumulator style is a dead end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I obviously need some time to digest the V3 plan, but it looks reasonable at a first reading. We will be happy help chipping away at it over time if you create a task-level epic out of it. And this PR would be a good intermediate-term simplification until then.
In this sense, I agree that this PR doesn't close the ticket fully. @mustafasrepo, can you change PR text to say "Make progress on #2723" instead of saying closes? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will try and help organize the tickets some more tomorrow morning. Thanks for this discussion
I updated PR body to include benchmark result(against master branch). Thanks for pointing this out. |
Looks good, mostly noise I guess (and even if we suffer a 10% hit, I would take that for the simpler implementation). |
my reading of the benchmarks was that some got 10% faster as well. Am I reading it wrong? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @mustafasrepo and @ozankabak -- I got through some of this PR today but I was not able to finish the review and have run out of time. I will complete my review first thing tomorrow
@@ -151,15 +149,13 @@ impl PhysicalGroupBy { | |||
|
|||
enum StreamType { | |||
AggregateStream(AggregateStream), | |||
GroupedHashAggregateStreamV2(GroupedHashAggregateStreamV2), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is so wonderful to see
@@ -202,7 +202,9 @@ impl RowAccumulator for CountRowAccumulator { | |||
} | |||
|
|||
fn evaluate(&self, accessor: &RowAccessor) -> Result<ScalarValue> { | |||
Ok(accessor.get_as_scalar(&DataType::Int64, self.state_index)) | |||
Ok(ScalarValue::Int64(Some( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this change necessary or is it a drive by refactor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drive-by mini bug fix if I am remembering correctly. I believe before this change, we were getting a NULL where we were supposed to get a zero in some use case. @mustafasrepo, am I remembering correctly? If so, we should add a mini-test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the group, where CountRowAccumulator
is working is empty. Without this change it produces NULL
, However, it should produce 0
. I cannot produce an example for this case. Since row_accumulator only works when query contains GROUP BY
, and GROUP BY
cannot produce empty group. However, as an example
SELECT count(*) as cnt
FROM aggregate_test_100
WHERE 1 = 0;
will produce (where CountAccumulator
is used)
"+-----+",
"| cnt |",
"+-----+",
"| 0 |",
"+-----+",
. However, if it were to work with CountRowAccumulator
. It would produce
"+-----+",
"| cnt |",
"+-----+",
"| |",
"+-----+",
which is wrong. This change fixes this bug.
In summary, there is no way to reproduce this error (As far as I know) in current implementation, since row_accumulator
is used only when query contains GROUP BY
clause. However, if we were to use row_accumulator
in non-group cases we can encounter this issue (I recognized this behavior when experimenting RowAccumulator
support for non-group by aggregation). By the way, below query would
SELECT count(*)
FROM aggregate_test_100
WHERE 1 = 0
GROUP BY ();
reproduce the issue. However, it is not supported in datafusion currently.
It got faster in some cases but a little slower in others. Overall, performance is similar I think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through this carefully -- great work @mustafasrepo and @ozankabak
cc @yjshen and @richox who I think worked on the group row format
cc @Dandandan who has worked in this code area before
/// keeps range for each accumulator in the field | ||
/// first element in the array corresponds to normal accumulators | ||
/// second element in the array corresponds to row accumulators | ||
indices: [Vec<Range<usize>>; 2], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The need to keep two lists of accumulators is quite unfortunate (maybe the code would be simpler if it were in a single Enum or behind a trait). However, I think this implementation is better than what we have on the master branch because it at least only has duplication with the aggregators rather than the entire GroupHash structure
During review of this code it was clear it was in need of some more cleanup -- PR proposing to do so in #4972 |
I will plan to merge this PR tomorrow unless there are any other comments |
#4973 looks good to me, we will help with migrating once the foundational tools are in place. |
Thanks again -- this is going to be great! |
Benchmark runs are scheduled for baseline = e6a0500 and contender = 96cf046. 96cf046 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #2723
re #4973
Rationale for this change
Currently there are two very similar implementation for GroupedHashAggregators. Implementation changes according whether all aggregators support row format or not. We can decrease code duplication by combining implementations.
What changes are included in this PR?
This PR groups aggregators supporting row format and not supporting. It calculates aggregator results for corresponding aggregators. Then writes the result to the final place in the schema. This has several advantages. It decreases code duplication. It also can utilize row aggregation even if all aggregators do not support row aggregation. For instance, consider the query below
We can calculate result of
MIN(c9)
,MAX(c9)
with row accumulator, andMIN(c13)
,MAX(c13)
with normal accumulator (where c9 is a primitive type that supports row accumulation, c13 is a complex type that doesn't support row accumulation). After generating results we can write the result to appropriate index to align with schema.MIN(c9)
,MAX(c9)
will be written to indices 1 and 3, similarlyMIN(c13)
,MAX(c13)
will be written to indices 0 and 2 at the final record batch.Are these changes tested?
They are covered by existing tests. Since this is basically a refactor.
I ran a benchmark against master branch. Result of the benchmark can be seen below
Are there any user-facing changes?