Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up and test sample metadata barplot computations #313
Speed up and test sample metadata barplot computations #313
Changes from 19 commits
ba65a87
c0e023c
9703416
5a5479b
13e0272
898a0b3
32174c0
d10c4ed
b59ba6e
e541686
c1440e6
3cb4771
65dfcc2
f08638d
e91538a
7083d67
3735832
1ece062
f685f99
7890a17
2b8626d
5fa827b
dc6476b
3aeec27
d2b7d37
80770a8
472a248
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since _tbl is a 2D array of samples by features, this may make more sense to developers who are not to familiar with biom-table. Minor suggestion feel free to ignore. (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's ok I'd like to keep this as is -- there are a couple of other places in the BIOM table JS where
presentFeatureIndices
is used when iterating over the table in this way. I added some context to the comment before this loop, so hopefully that makes things clearer as a compromise :) You're totally correct, though -- we should make this code as non-intimidating as possible, since these functions are all getting used pretty frequently ...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a note: It might be less memory expensive/more efficient to iterate through it this was 2D array especially for large trees. Storing (even temporarily) an object that uses fId's as keys can be quite memory intensive + js has a bit more (although not a ton) of overhead compared to arrays when iterating. I would keep this the same for now and we can come back to this if memory becomes a problem when working with large trees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood and agreed. Added a TODO to the code detailing this optimization.
One weird-ish thing is that none of the Empress JS code besides the BIOM table knows anything about feature indices -- it only stores information about feature names (aka IDs), as far as I know. Moving the index stuff to empress.js / treeData might be a hassle, but I think it could save a decent amount of space -- I guess in-the-BIOM-table tip names are technically stored in two places (treeData and the BIOM table).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should either let the user know that the sample stack bar plots are displaying proportions instead of counts or we should just display the counts (or all the user to choose).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point -- and supporting displaying "variable-length" stacked barplots (where e.g. 1 sample has a given fixed length, so a tip present in 2 samples vs. a tip present in 20 samples will display differently) would be a cool feature to add. I've added an issue for this at #322, and updated the README to be clearer about proportions being displayed.