feat: add additional counting aggregations to `AggBy` #6358

lbooker42 · 2024-11-08T22:54:54Z

New aggregations are:

AggCountNonNull()
AggCountNull()
AggCountNegative()
AggCountPositive()
AggCountZero()
AggCountNaN()
AggCountInfinite()
AggCountFinite()
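As a rough illustration of what each listed aggregation counts, here is a minimal pure-Python sketch. It is not the Deephaven API or the PR's implementation. `PREDICATES` and `count_values` are hypothetical names, `None` stands in for a null cell, and the treatment of NaN and infinite values by the sign-based counts (NaN matches none of them, +inf counts as positive) is an assumption of this sketch rather than something stated in the PR.

```python
import math
from typing import Callable, Dict, Iterable, Optional

# None stands in for a null cell (an assumption of this sketch).
Value = Optional[float]

# Hypothetical predicates, one per aggregation name in the list above.
PREDICATES: Dict[str, Callable[[Value], bool]] = {
    "CountNonNull": lambda v: v is not None,
    "CountNull": lambda v: v is None,
    "CountNegative": lambda v: v is not None and v < 0,
    "CountPositive": lambda v: v is not None and v > 0,
    "CountZero": lambda v: v is not None and v == 0,
    "CountNaN": lambda v: v is not None and math.isnan(v),
    "CountInfinite": lambda v: v is not None and math.isinf(v),
    "CountFinite": lambda v: v is not None and math.isfinite(v),
}


def count_values(values: Iterable[Value], kind: str) -> int:
    """Count the values in one bucket that satisfy the named predicate."""
    predicate = PREDICATES[kind]
    return sum(1 for v in values if predicate(v))


sample = [1.0, -2.0, 0.0, None, float("nan"), float("inf")]
for kind in PREDICATES:
    print(kind, count_values(sample, kind))
```

In this sketch NaN comparisons are always false, so a NaN value contributes only to the non-null and NaN counts.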

devinrsmith · 2024-11-14T18:21:49Z

table-api/src/main/java/io/deephaven/api/agg/spec/AggSpecCountValues.java

+    @Parameter
+    public abstract AggCountType countType();


I'll post the same sort of concerns here that I have for #6270; I think we would do better to express this in terms of a Filter. What if I want to count NULL or NAN or INFINITE? Or any combination?

Interesting idea, would prefer to see this as a new feature suggestion.

These agg_by ops are companions to Numeric / Basic vector ops (and will be paralleled by update_by cum_count and rolling_count variants) and this filter idea isn't trivial to extend to all these routines.
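To make the Filter suggestion in this thread concrete, here is a minimal pure-Python sketch of counting with composable predicates, so that "NULL or NAN or INFINITE" becomes one combined condition. This is only an illustration of the idea, not a proposed Deephaven API. `any_of`, `count_where`, and the `is_*` helpers are hypothetical names, and `None` stands in for a null cell.

```python
import math
from typing import Callable, Iterable, Optional

Value = Optional[float]
Predicate = Callable[[Value], bool]


def is_null(v: Value) -> bool:
    return v is None


def is_nan(v: Value) -> bool:
    return v is not None and math.isnan(v)


def is_infinite(v: Value) -> bool:
    return v is not None and math.isinf(v)


def any_of(*predicates: Predicate) -> Predicate:
    """Combine predicates: true when at least one of them matches."""
    return lambda v: any(p(v) for p in predicates)


def count_where(values: Iterable[Value], predicate: Predicate) -> int:
    """Count the values that satisfy an arbitrary predicate."""
    return sum(1 for v in values if predicate(v))


values = [1.0, None, float("nan"), float("inf"), float("-inf"), 0.0]
print(count_where(values, any_of(is_null, is_nan, is_infinite)))
```

The sketch covers only a single in-memory bucket. The reply's concern is that carrying such a filter through the vector ops and the planned update_by variants is the non-trivial part.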

lbooker42 · 2024-11-20T20:01:52Z

Note on extensive use of lambdas and performance

I ran repeated tests on 4 different VMs (arm64 on Mac M1) and verified that the lambda-heavy ops are not penalized and that the lambdas are (apparently) inlined by the JIT.

The comparison is among three operators: AggSum, which uses type-specialized operators but evaluates every value; AggCountNonNull, which uses lambdas and evaluates every value; and AggCountAll (which aliases AggCount), which evaluates no values and simply returns the size of the bucket rowset.

The expected (and confirmed) result is that the new counting ops fall between AggSum and AggCountAll/AggCount in performance.
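The three code paths being compared can be sketched in pure Python as follows. This only shows the shape of the work each operator does per bucket. It is not the engine code, and Python timings say nothing about JVM JIT inlining. `agg_sum`, `agg_count_non_null`, `agg_count_all`, and `time_op` are hypothetical names, and `None` stands in for a null cell.

```python
import timeit
from typing import Callable, Optional, Sequence

Value = Optional[float]


def agg_sum(bucket: Sequence[Value]) -> float:
    """Touches every value and does arithmetic on each non-null one."""
    return sum(v for v in bucket if v is not None)


def agg_count_non_null(
    bucket: Sequence[Value],
    is_non_null: Callable[[Value], bool] = lambda v: v is not None,
) -> int:
    """Touches every value, but only evaluates a predicate (a lambda)."""
    return sum(1 for v in bucket if is_non_null(v))


def agg_count_all(bucket: Sequence[Value]) -> int:
    """Touches no values: just reports the bucket size."""
    return len(bucket)


def time_op(op: Callable[[Sequence[Value]], object],
            bucket: Sequence[Value], repeats: int = 5) -> float:
    """Best-of-N wall time in seconds for one call of op(bucket)."""
    return min(timeit.repeat(lambda: op(bucket), number=1, repeat=repeats))


bucket = [1.0, None, 2.5]
print(agg_sum(bucket), agg_count_non_null(bucket), agg_count_all(bucket))
```

A call such as `time_op(agg_count_non_null, large_bucket)` gives a crude per-operator timing for a larger, caller-supplied bucket.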

engine/table/src/main/java/io/deephaven/engine/table/impl/by/BaseChunkedCountOperator.java

proto/proto-backplane-grpc/src/main/proto/deephaven/proto/table.proto

py/server/deephaven/agg.py

table-api/src/main/java/io/deephaven/api/agg/util/AggCountType.java

jmao-denver

Python changes LGTM

Initial commit for AggCount additional operators.

1e21206

lbooker42 self-assigned this Nov 8, 2024

lbooker42 added the query engine label Nov 8, 2024

lbooker42 added this to the 0.38.0 milestone Nov 8, 2024

lbooker42 added the DocumentationNeeded and ReleaseNotesNeeded labels Nov 8, 2024

Added python server and GRPC client support.

94eddb2

lbooker42 requested review from devinrsmith, nbauernfeind, niloc132, rcaudy, chipkent and jmao-denver as code owners November 13, 2024 23:45

Correct build failure.

67e49ee

devinrsmith reviewed Nov 14, 2024

lbooker42 mentioned this pull request Nov 14, 2024

Additional operator for update_by requested #5709

Open

lbooker42 added 5 commits November 15, 2024 08:31

Merge branch 'main' into lab-agg-count

4db4e5d

Added agg_by count_all alias to count_.

ca3cad2

Relocated AggCountType to be useful for multiple count operations.

03d01be

Unifying the GRPC among the count types.

3b61be1

Merged with main.

85f08dd

cpwright reviewed Nov 20, 2024

lbooker42 added 4 commits November 20, 2024 13:40

Correct TestAggBy incorrect column generators for 'charCol'.

776f1cd

Added CountNonNZero, CountNonNegative, CountNonPositive AggCountType values and implementations.

890e474

Merged with main.

63530da

Doc changes and additional tests.

3d0cde8

cpwright reviewed Nov 21, 2024

engine/table/src/main/java/io/deephaven/engine/table/impl/by/BaseChunkedCountOperator.java (outdated, resolved)

Updated CountOperator modify chunk to assert non-negative count are maintained.

68bfa32

cpwright approved these changes Nov 21, 2024

jmao-denver approved these changes Nov 21, 2024
