Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix course-overviews sink not ordering blocks correctly #59

Merged
merged 2 commits into from
Jun 14, 2024

Conversation

bmtcril
Copy link
Contributor

@bmtcril bmtcril commented Jun 7, 2024

We've been using Modulestore's get_items, but even though it's often correct the ordering isn't guaranteed. The new method walks the course tree itself, then adds in the detached models separately.

Potential concerns:

This was already a pretty heavy operation, calling into Modulestore twice to walk all of the blocks may not be performant enough or cause issues in the LMS/CMS. It may be preferable to ignore detached blocks if we are not currently using them, but I wanted to preserve parity with what we've been doing unless/until there is an actual problem.

Closes: #61

Merge checklist:
Check off if complete or not applicable:

  • Version bumped
  • Changelog record added
  • Fixup commits are squashed away
  • Unit tests added/updated
  • Manual testing instructions provided
  • Noted any: Concerns, dependencies, migration issues, deadlines, tickets

We've been using Modulestore's get_items, but even though it's often correct the ordering isn't guaranteed. The new method walks the course tree itself, then adds in the detached models separately.
Copy link

github-actions bot commented Jun 7, 2024

Coverage report

Click to see where and how coverage changed

FileStatementsMissingCoverageCoverage
(new stmts)
Lines missing
  platform_plugin_aspects
  __init__.py
  platform_plugin_aspects/sinks
  course_overview_sink.py
Project Total  

This report was generated by python-coverage-comment-action

# get_items call does not guarantee ordering. It does not return
# detached blocks, so we gather them separately below.
course_block = modulestore.get_course(
course_key, revision="rev-opt-published-only"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is that revision key coming from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It comes from here, where it has been unchanged for 10 yrs: https://github.com/openedx/edx-platform/blob/749e18bcee396755f411f46d79b25f42628af7d1/xmodule/modulestore/__init__.py#L68

I'll make it a local constant and document it better if Dave says this is an ok direction, but I'd rather not import it from modulestore since that means a shim function for the import, mocking it in tests, etc, which doesn't seem worth it since it would only guard against the very unlikely change in the string value.

course_block = modulestore.get_course(
course_key, revision="rev-opt-published-only"
)
items = self.get_xblocks_recursive(course_block)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pulling all of a course's blocks into memory like this might not be a good idea..

Holding just the parts of the blocks that we need (usage key, display_name, ...) would be more memory efficient than holding the whole block(s).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is related to this PR, but I'm seeing Course Overview and XBlock sink events in the logs when I just add a section or subsection to a course, but I haven't hit Publish yet. Are these structure blocks are automatically "published"?

I mention it because it seems to block the "new section/subsection" request response in Studio, which was already noticeably slow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can refactor it to do some kind of generator thing to avoid holding everything in memory.

If this is getting triggered that should mean that COURSE_PUBLISHED for fired, but I'd guess the slowness you're seeing is in dev where all of the heavyweight Celery tasks are happening in-process. I wouldn't expect that slowness to be a thing that happens in a production setting. It's sub-second on the demo course for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pomegranited I took a stab at the memory consumption in the latest commit, but I'm not sure how much it helps. Depends on modulestore internals that I don't fully understand yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent -- serializing the blocks down to minimal properties should help, thank you!

but I'd guess the slowness you're seeing is in dev where all of the heavyweight Celery tasks are happening in-process

Oh absolutely, my poor little devstack isn't representative, just a good canary.

This attempts to keep memory usage down by keeping copies of the blocks around for as little time as possible.
Copy link
Contributor

@pomegranited pomegranited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Working great for me, thanks @bmtcril

  • I tested this on my Aspects tutor dev stack, watched data sync to clickhouse whenever course structure is changed or a course is published.
  • I read through the code, and checked that the tests validate this change.
  • I checked for accessibility issues N/A
  • Includes documentation N/A
  • User-facing strings are extracted for translation N/A

@bmtcril bmtcril merged commit 6eab1af into main Jun 14, 2024
10 checks passed
@bmtcril bmtcril deleted the bmtcril/fix_xblock_serialization_order branch June 14, 2024 12:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Course overview sink sends incorrect sort order
4 participants