enhance: reduce stats task cost by skipping ser/de #39568

Open
wants to merge 5 commits into
base: master
from

tedxu
Contributor

@tedxu tedxu commented Jan 24, 2025

See #37234

@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tedxu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot added labels on Jan 24, 2025: approved, size/XXL (denotes a PR that changes 1000+ lines)
@mergify bot added labels on Jan 24, 2025: dco-passed (DCO check passed), kind/enhancement (issues or changes related to enhancement)
Contributor

mergify bot commented Jan 24, 2025

@tedxu go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 24, 2025

@tedxu E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Jan 24, 2025

@tedxu cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

@tedxu tedxu force-pushed the enhance/skip_serde_stats_task branch from a12ccda to a80189f Compare January 24, 2025 03:47
Contributor

mergify bot commented Jan 24, 2025

@tedxu E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.


codecov bot commented Jan 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.40%. Comparing base (5fb597b) to head (be8cc56).
Report is 12 commits behind head on master.

❗ There is a different number of reports uploaded between BASE (5fb597b) and HEAD (be8cc56).

HEAD has 1 upload less than BASE:
Uploads for BASE (5fb597b): 2
Uploads for HEAD (be8cc56): 1
Additional details and impacted files

@@             Coverage Diff             @@
##           master   #39568       +/-   ##
===========================================
- Coverage   80.97%   69.40%   -11.57%     
===========================================
  Files        1408      302     -1106     
  Lines      198881    27065   -171816     
===========================================
- Hits       161041    18785   -142256     
+ Misses      32151     8280    -23871     
+ Partials     5689        0     -5689     
Components Coverage Δ
Client ∅ <ø> (∅)
Core 69.40% <ø> (-0.16%) ⬇️
Go ∅ <ø> (∅)

see 1113 files with indirect coverage changes

Contributor

mergify bot commented Jan 24, 2025

@tedxu go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 26, 2025

@tedxu E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@tedxu
Contributor Author

tedxu commented Jan 26, 2025

In this PR, the sort cost depends heavily on how ordered the original dataset already is. If the original dataset is already ordered, the CPU and memory cost can be negligible. I've also tested the worst case: for 1M rows, the cost is as follows (about 2.4 seconds and 4.5 GB):

BenchmarkSort/sort-8         	       1	2409850417 ns/op	4545456624 B/op	42113988 allocs/op

Even the worst case is still an improvement compared to the current implementation. The primary cost comes from the Slice operation in Arrow, which is not easy to optimize within Milvus's current structure.
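
To illustrate why the cost would track the input's existing order, here is a minimal sketch. It is not the code in this PR. It assumes, hypothetically, that the sorted output is assembled from contiguous row ranges of the input, with one slice operation per range. The names `run` and `sortedRuns` are invented for this example, and plain `int64` keys stand in for the real records.

```go
package main

import (
	"fmt"
	"sort"
)

// run is a hypothetical (offset, length) pair naming a contiguous
// range of rows in the original, unsorted input.
type run struct {
	offset, length int
}

// sortedRuns sorts row indices by key, then coalesces neighbouring
// indices into contiguous runs. Under the assumption stated above,
// each run would cost one slice operation.
func sortedRuns(keys []int64) []run {
	idx := make([]int, len(keys))
	for i := range idx {
		idx[i] = i
	}
	// Sort the index permutation rather than the rows themselves.
	sort.SliceStable(idx, func(a, b int) bool {
		return keys[idx[a]] < keys[idx[b]]
	})

	var runs []run
	for _, i := range idx {
		// Extend the last run when this row directly follows it.
		if n := len(runs); n > 0 && runs[n-1].offset+runs[n-1].length == i {
			runs[n-1].length++
			continue
		}
		runs = append(runs, run{offset: i, length: 1})
	}
	return runs
}

func main() {
	ordered := []int64{1, 2, 3, 4, 5, 6}
	partial := []int64{4, 5, 6, 1, 2, 3}
	reversed := []int64{6, 5, 4, 3, 2, 1}
	fmt.Println(len(sortedRuns(ordered)), len(sortedRuns(partial)), len(sortedRuns(reversed)))
}
```

In this sketch an already ordered input collapses to a single run, while a fully reversed input yields one run per row, which is the shape of the best and worst cases described above.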

Contributor

mergify bot commented Jan 26, 2025

@tedxu go-sdk check failed, comment rerun go-sdk can trigger the job again.

Signed-off-by: Ted Xu <[email protected]>
Contributor

mergify bot commented Jan 26, 2025

@tedxu go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 26, 2025

@tedxu E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Jan 26, 2025

@tedxu cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Signed-off-by: Ted Xu <[email protected]>
Contributor

mergify bot commented Jan 28, 2025

@tedxu go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Jan 28, 2025

@tedxu E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Labels: approved, dco-passed (DCO check passed), kind/enhancement (issues or changes related to enhancement), size/XXL (denotes a PR that changes 1000+ lines)
2 participants