feat(datastore): support checkpoint in server side #3093

Open
wants to merge 13 commits into
base: main

jialeicui
Contributor

@jialeicui jialeicui commented Dec 20, 2023

Description

Adds server-side checkpoint support to the datastore. Once in place, checkpoints make it possible to:

  1. Keep only the data versions referenced by a checkpoint, which reduces the disk and memory space the data occupies.
  2. Fetch statistics for the data a checkpoint points to in constant time, including row count, minimum key, maximum key, etc.
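
To make the two goals concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes a toy in-memory table where every write bumps a revision; `CheckpointSketch`, `createCheckpoint`, `stats`, and `garbageCollect` are hypothetical names chosen for illustration and do not mirror the actual `Checkpoint`, `DataStore`, or `MemoryTableImpl` classes. Statistics are computed once when the checkpoint is created, so later lookups are a single map access, and garbage collection keeps only the versions visible from a checkpoint or from the latest revision.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch; names are illustrative and not the Starwhale API.
class CheckpointSketch {
    // Statistics captured once, at checkpoint creation time.
    static final class Checkpoint {
        final long revision;
        final long rowCount;
        final String minKey; // null if the table was empty at this revision
        final String maxKey;

        Checkpoint(long revision, long rowCount, String minKey, String maxKey) {
            this.revision = revision;
            this.rowCount = rowCount;
            this.minKey = minKey;
            this.maxKey = maxKey;
        }
    }

    // key -> (revision -> value); a null value marks a deletion.
    private final TreeMap<String, TreeMap<Long, String>> rows = new TreeMap<>();
    private final Map<Long, Checkpoint> checkpoints = new HashMap<>();
    private long revision = 0;

    long put(String key, String value) {
        revision++;
        rows.computeIfAbsent(key, k -> new TreeMap<>()).put(revision, value);
        return revision;
    }

    long delete(String key) {
        return put(key, null);
    }

    // Value of a key as seen at the given revision, or null if absent.
    String get(String key, long atRevision) {
        TreeMap<Long, String> versions = rows.get(key);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> entry = versions.floorEntry(atRevision);
        return entry == null ? null : entry.getValue();
    }

    // One full scan here; the result is cached for constant-time reads later.
    Checkpoint createCheckpoint() {
        long count = 0;
        String min = null;
        String max = null;
        for (String key : rows.keySet()) { // keys iterate in ascending order
            if (get(key, revision) != null) {
                if (min == null) {
                    min = key;
                }
                max = key;
                count++;
            }
        }
        Checkpoint cp = new Checkpoint(revision, count, min, max);
        checkpoints.put(revision, cp);
        return cp;
    }

    // Constant-time lookup of the cached statistics.
    Checkpoint stats(long checkpointRevision) {
        return checkpoints.get(checkpointRevision);
    }

    // Drops every version that neither a checkpoint nor the latest revision can see.
    int garbageCollect() {
        Set<Long> pinned = new HashSet<>(checkpoints.keySet());
        pinned.add(revision);
        int removed = 0;
        for (TreeMap<Long, String> versions : rows.values()) {
            Set<Long> keep = new HashSet<>();
            for (long p : pinned) {
                Long visible = versions.floorKey(p);
                if (visible != null) {
                    keep.add(visible);
                }
            }
            int before = versions.size();
            versions.keySet().retainAll(keep);
            removed += before - versions.size();
        }
        return removed;
    }
}
```

In this sketch, any revision that is not pinned by a checkpoint becomes unreadable after `garbageCollect()`, which is the same property that makes rollback after an upgrade a concern.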

Issues to be considered (#3098):

  1. Data written by older versions must remain supported, so migration logic is needed to ensure that checkpoints on all historical data are processed correctly.
  2. Whether a Controller can still be rolled back after an upgrade, since data versions might already have been garbage collected by the checkpoint feature.

Modules

  • UI
  • Controller
  • Agent
  • Client
  • Python-SDK
  • Others

Checklist

  • run code format and lint check
  • add unit test
  • add necessary doc

@jialeicui jialeicui marked this pull request as draft December 20, 2023 09:37

codecov bot commented Dec 20, 2023

Codecov Report

Attention: 20 lines in your changes are missing coverage. Please review.

Comparison is base (cf58e90) 82.68% compared to head (ff4446e) 82.74%.

| Files | Patch % | Lines |
|---|---|---|
| ...tarwhale/mlops/datastore/impl/MemoryTableImpl.java | 89.18% | 6 Missing and 10 partials ⚠️ |
| .../java/ai/starwhale/mlops/datastore/Checkpoint.java | 87.50% | 0 Missing and 2 partials ⚠️ |
| ...n/java/ai/starwhale/mlops/datastore/DataStore.java | 90.00% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3093      +/-   ##
============================================
+ Coverage     82.68%   82.74%   +0.06%     
- Complexity     3182     3232      +50     
============================================
  Files           573      575       +2     
  Lines         31969    32181     +212     
  Branches       1865     1897      +32     
============================================
+ Hits          26433    26628     +195     
- Misses         4712     4715       +3     
- Partials        824      838      +14     

| Flag | Coverage Δ |
|---|---|
| console | 72.09% <ø> (ø) |
| controller | 73.55% <89.94%> (+0.19%) ⬆️ |
| standalone | 91.89% <100.00%> (+0.03%) ⬆️ |
| unittests | 91.63% <100.00%> (+0.03%) ⬆️ |
