Add CommunityAggregatedFields #886
TODO: ensure the fields are updated in an efficient manner, add tests, docstrings, a Celery job, etc.
Codecov Report

Coverage diff, master vs. #886:

             master     #886     +/-
Coverage     92.21%   92.23%  +0.02%
Files           270      271      +1
Lines          7741     7778     +37
Branches        737      739      +2
Hits           7138     7174     +36
Misses          501      502      +1
Partials        102      102
f78d19c to 52e46d3
@MythicManiac ready for review. Since you last checked, I made the batch_size optional, added babby's first Celery task in a true monkey-see-monkey-do manner, and updated the linting warning commit.
52e46d3 to cece83c
The inline comments don't do a good job of outlining the important parts, so here is a summary:
Important
- If this PR were to be merged as-is and deployed, functionality currently in production would break. Address this e.g. by setting up a task schedule with a data migration and/or creating a migration that populates initial data with the current prod equivalent.
- The current implementation of the aggregation isn't very memory-friendly and will eventually hit resource limits in prod (if not immediately).
- Not as critical, but splitting the aggregation into two phases (one for creation, one for updating) should simplify the code quite a bit.
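The memory concern above is commonly addressed by processing rows in bounded batches rather than loading everything at once. The following is a minimal sketch of that idea in plain Python, not the code from this PR: `in_batches` and `total_downloads` are hypothetical names, and the optional `batch_size` only mirrors the parameter mentioned earlier in the conversation.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def in_batches(
    items: Iterable[T], batch_size: Optional[int] = None
) -> Iterator[List[T]]:
    """Yield lists of at most batch_size items (hypothetical helper).

    With batch_size=None everything is yielded as one batch, which is
    simple but holds the whole input in memory.
    """
    iterator = iter(items)
    if batch_size is None:
        batch = list(iterator)
        if batch:
            yield batch
        return
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    while True:
        # islice consumes at most batch_size items from the shared iterator.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def total_downloads(
    download_counts: Iterable[int], batch_size: Optional[int] = None
) -> int:
    """Sum download counts while holding at most one batch in memory."""
    return sum(sum(batch) for batch in in_batches(download_counts, batch_size))
```

With a database-backed iterable, the same pattern keeps peak memory proportional to `batch_size` instead of the number of rows, at the cost of more round trips.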
django/thunderstore/community/migrations/0024_auto_20231004_0617.py
cece83c to 8d2934b
be7a0f3 to a0d0227
Mostly seems good. I'd not consider the issues I commented on blockers individually since they could be addressed e.g. in a separate PR, but combined together I feel this could use at least one more round of changes.
django/thunderstore/community/migrations/0025_init_aggregated_fields.py
django/thunderstore/community/migrations/0026_schedule_aggregated_fields_refresh.py
The computationally heavy fields package_count and download_count will be updated periodically by a background task from now on. They are stored in a separate model to make it clearer that they're not regular fields.

To save memory and improve parallelization, updates are done in steps: first the related object is created for all Communities that don't have one yet, and then the fields are updated for each Community separately. Existing queryset helpers are used to avoid duplicating the logic for which packages and versions should be included in the calculations. This means more database operations, but it was deemed a fair tradeoff since the updates are done as background tasks.

The values should normally be accessed via the .aggregated getter to avoid errors if the Community doesn't have the values computed yet. The .aggregated_fields relation is also available, e.g. when working with QuerySets.

Refs TS-1850
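The two-step update and the .aggregated getter described in this commit message can be illustrated with a small sketch. It uses plain dataclasses as stand-ins for the Django models, so it is an assumption-laden illustration rather than the PR's implementation: `create_missing`, `update_one`, and the zeroed fallback returned by the getter are hypothetical choices.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class CommunityAggregatedFields:
    """Stand-in for the separate model holding the heavy fields."""
    package_count: int = 0
    download_count: int = 0


@dataclass
class Community:
    """Stand-in for the Community model (not a real Django model)."""
    identifier: str
    aggregated_fields: Optional[CommunityAggregatedFields] = None

    @property
    def aggregated(self) -> CommunityAggregatedFields:
        # Assumed behavior: return zeroed placeholder values instead of
        # failing when nothing has been computed for this community yet.
        if self.aggregated_fields is None:
            return CommunityAggregatedFields()
        return self.aggregated_fields


def create_missing(communities: Iterable[Community]) -> int:
    """Step 1: attach an empty related object where none exists yet."""
    created = 0
    for community in communities:
        if community.aggregated_fields is None:
            community.aggregated_fields = CommunityAggregatedFields()
            created += 1
    return created


def update_one(community: Community, downloads_by_package: Dict[str, int]) -> None:
    """Step 2: recompute the fields for a single community."""
    fields = community.aggregated_fields
    if fields is None:
        raise ValueError("run create_missing() before update_one()")
    fields.package_count = len(downloads_by_package)
    fields.download_count = sum(downloads_by_package.values())
```

Because step 2 touches one community at a time, each update could run as its own background task, which is one way to get the parallelization the commit message mentions.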
The task is scheduled to run once per hour, on a minute mark that doesn't clash with other currently scheduled tasks. Refs TS-1850
a0d0227 to b086bca
Looks good to me 👍