Is your feature request related to a problem or challenge?
While working on #6800 / #4973 I noticed that the SUM accumulator's intermediate state contains two fields:
sum: Native
count: usize
Maintaining this state is on the hot path of the aggregates (e.g. TPCH Q1 has 4 sums), so reducing the work is performance critical.
The counts are only needed for the retractable version of the accumulator (aka the sliding accumulator used in window functions), but they are currently carried through all implementations.
The sum accumulator is here: https://github.com/apache/arrow-datafusion/blob/main/datafusion/physical-expr/src/aggregate/sum.rs
Describe the solution you'd like
I would like to remove the count field from the sum accumulator (or only use it when we have sliding windows).
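To make the proposed split concrete, here is a minimal sketch of the two state layouts. It is not the code in `sum.rs`: the names `PlainSum` and `SlidingSum` are hypothetical, `i64` stands in for the native type, and it assumes the count's role in the retractable version is to report NULL once every value has left the window.

```rust
/// Hypothetical non-retractable SUM state: only the running sum.
/// `None` models SQL NULL (no non-null input seen yet).
#[derive(Debug, Default)]
struct PlainSum {
    sum: Option<i64>,
}

impl PlainSum {
    /// Add all non-null values; there is no count to maintain on the hot path.
    fn update(&mut self, values: &[Option<i64>]) {
        for v in values.iter().flatten() {
            self.sum = Some(self.sum.unwrap_or(0) + v);
        }
    }

    fn evaluate(&self) -> Option<i64> {
        self.sum
    }
}

/// Hypothetical retractable (sliding window) SUM state.
/// Assumption: the count is kept so an emptied window evaluates to NULL, not 0.
#[derive(Debug, Default)]
struct SlidingSum {
    sum: i64,
    count: usize,
}

impl SlidingSum {
    /// Add non-null values entering the window.
    fn update(&mut self, values: &[Option<i64>]) {
        for v in values.iter().flatten() {
            self.sum += v;
            self.count += 1;
        }
    }

    /// Remove non-null values leaving the window.
    /// Callers must only retract values that were previously added.
    fn retract(&mut self, values: &[Option<i64>]) {
        for v in values.iter().flatten() {
            self.sum -= v;
            self.count -= 1;
        }
    }

    /// NULL when the window holds no non-null values.
    fn evaluate(&self) -> Option<i64> {
        (self.count > 0).then_some(self.sum)
    }
}
```

With this split, a plain `GROUP BY` aggregate would touch one field per input value, and only the window-function path would pay for the extra counter.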
Describe alternatives you've considered
No response
Additional context
No response