Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of DataCollectionFragment #3008

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

shobhitagarwal1612
Copy link
Member

@shobhitagarwal1612 shobhitagarwal1612 commented Jan 16, 2025

Fixes #2958
Fixes #2993

In previous PRs, we created TaskDataHandler to maintain state of TaskData and TaskSequenceHandler for generating task sequence. TaskSequenceHandler not only is a wrapper for operations related to task sequence but also caches the sequence internally for better performance.

In this PR, we aim to replace custom creation of task sequence in DataCollectionViewModel using TaskSequenceHandler. This would not only abstract out all task sequence related operations but also greatly improve the overall performance and jankiness of the data collection process. Currently, we re-compute the task sequence upto 5-7 times during each screen. This number increases with each screen as the stored values also keeps increasing. Task sequence computation is a heavy operation as it requires filtering tasks based on previously selected values.

This PR shouldn't change the behavior of the app other than improving the performance.

Ran the app locally and submitted multiple jobs to ensure no regressions are introduced. Also, added extensive unit tests for each helper class with core business logic for test coverage.

@gino-m @sufyanAbbasi PTAL?

@shobhitagarwal1612 shobhitagarwal1612 changed the title Improve performance of data collection by using TaskSequenceHandler in DataCollectionViewModel Improve performance by using TaskSequenceHandler in DataCollectionViewModel Jan 16, 2025
@shobhitagarwal1612 shobhitagarwal1612 changed the title Improve performance by using TaskSequenceHandler in DataCollectionViewModel Improve performance of DataCollectionFragment Jan 16, 2025
Copy link

codecov bot commented Jan 18, 2025

Codecov Report

Attention: Patch coverage is 87.87879% with 4 lines in your changes missing coverage. Please review.

Project coverage is 63.46%. Comparing base (a0975c3) to head (5b5ef4c).

Files with missing lines Patch % Lines
...round/ui/datacollection/DataCollectionViewModel.kt 86.66% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3008      +/-   ##
============================================
- Coverage     63.55%   63.46%   -0.09%     
+ Complexity     1267     1258       -9     
============================================
  Files           270      270              
  Lines          6560     6539      -21     
  Branches        927      924       -3     
============================================
- Hits           4169     4150      -19     
+ Misses         1786     1783       -3     
- Partials        605      606       +1     
Files with missing lines Coverage Δ
...ground/ui/datacollection/DataCollectionFragment.kt 38.54% <ø> (-0.64%) ⬇️
...ndroid/ground/ui/datacollection/TaskDataHandler.kt 93.75% <100.00%> (+0.41%) ⬆️
...id/ground/ui/datacollection/TaskSequenceHandler.kt 100.00% <100.00%> (ø)
...round/ui/datacollection/DataCollectionViewModel.kt 77.01% <86.66%> (-1.79%) ⬇️

@shobhitagarwal1612 shobhitagarwal1612 force-pushed the ashobhit/2958/integrate-sequence-handler-latest branch from 75335d5 to d55ec01 Compare January 18, 2025 11:06
@shobhitagarwal1612 shobhitagarwal1612 requested review from gino-m and sufyanAbbasi and removed request for gino-m January 18, 2025 11:12
@shobhitagarwal1612 shobhitagarwal1612 marked this pull request as ready for review January 18, 2025 11:12
@auto-assign auto-assign bot requested a review from scolsen January 18, 2025 11:12
Copy link
Contributor

@sufyanAbbasi sufyanAbbasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much! The new class is incredibly clean and the helper methods improve the readability greatly! I just had a couple of nits and question about implementation, but otherwise LGTM!

I wanted to call out that Sequences are designed to be lazily evaluated to reduce the overhead of regenerating the list every time, but perhaps I didn't implement it right the first time because I was reprocessing the Tasks list every time 🤦‍♂️:
https://kotlinlang.org/docs/sequences.html#sequence

Therefore, I think that the extra complexity in TaskSequenceHandler that initializes and refreshes the sequence can be made slightly simpler without losing any performance.

Instead, consider caching ONLY the Sequence<TaskSelection> which correlates to the user's selection, and we can still pass in the theoretical TaskSelection when computing the test sequence.

So the method, getTaskSequence(Sequence<TaskSelection>?) (which uses the cached sequence if unspecified, otherwise the test one), will return a new Sequence<Task> every time, but this will always be an O(1) operation (IIUC from reading the Kotlin documentation).

I guess this is just an implementation detail, but my intention is to reduce some cognitive overhead by focusing on the independent variable here, Sequence<TaskSelection> rather than the thing we dynamically generate (the task sequence), and some of the methods can be greatly simplified.

Anyway, just food for thought! LGTM!

condition: Condition,
taskValueOverride: Pair<String, TaskData?>? = null,
): Boolean = condition.fulfilledBy(taskDataHandler.getTaskSelections(taskValueOverride))
// Don't call `refreshTaskSequence` as the new value hasn't been set yet.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Consider adding a new method to TaskSequenceHandler called testSequence(taskDataHandler, taskValueOverride) which encapsulates this logic. I feel like if a developer can accidentally mess it up by using the wrong method, then it's better to make it a method with proper documentation.

/** Returns true if the given [taskId] is last task in sequence. */
fun isLastPosition(taskId: String): Boolean = taskId == getTaskSequence().last().id
fun isLastPositionWithValue(taskId: String, newValue: TaskData?): Boolean {
if (taskDataHandler.getData(taskId) == newValue) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to make sure I understand the logic here, so I'll walk through an example. So say I'm on the "last" task with a select of three options: A, B, and C, with two more conditional tasks ahead of it, Conditional Task A, and Conditional Task B (C will show "Done").

The user changes the selection to A, so setOnValueChanged on the button is called and this function is invoked, so Condition A is now the last task. They click next, we refresh the sequence, and save it to the task's data. Then the user hits back.

So now the user selects B, so because this case is not true, we return false from the next part, or if they select C, the next part returns true.

Now when we select A again, we fall into this if case and we reuse the last sequence generated, so isLast() will always return false?

If so, maybe we can change the comment to Reuse the existing task sequence if the value has already been saved (i.e. after pressing "Next" and going back).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants