Barplots in the circular / rectangular layout #201

fedarko · 2020-06-23T02:07:16Z

This was mentioned in #97 (which has since been closed, since the focus of that was on the circular layout).

Now that the circular layout is implemented and tested, supporting visualizing tip-level feature metadata as barplots would be a really cool feature to add. This could be useful for a few different types of feature metadata, ranging from Songbird/ALDEx2/... differentials (or other "importance scores") to taxonomy annotation confidence values, etc.

fedarko · 2020-06-24T20:13:15Z

Also, it'd be cool to optionally support visualizing information passed over from Emperor as barplots -- it could be really useful to see e.g. presence information as tip-level information, while maintaining previous coloring of the tree (e.g. by feature metadata). Biologically, this would be a way of showing what particular taxa are unique to which groups of selected samples, or something along those lines.

fedarko · 2020-07-18T00:23:08Z

From doing some planning, I think there are three types of barplots that would be good to work on supporting (and potentially more if requested):

1. Assign each tip a bar of fixed length, and vary the colors of the bars based on a feature metadata field. These could be either categorical colors (e.g. taxonomy annotations) or quantitative colors (e.g. Songbird/ALDEx2/etc. differential values, other types of feature importance scores as suggested by @shihuang047, etc.).

   Example: The "Host Class" ring in Fig. 1 of Song/Sanders et al.

2. Assign each tip a bar of fixed color, and vary the lengths of the bars based on a (quantitative) feature metadata field.

   Example: The relative abundance barplots in Fig. 2A of Baker et al. (not exactly comparable because this barplot has more than one category, but the same general idea).

3. Assign each tip a bar of fixed length, and draw a stacked barplot based on this tip's sample presence information for a selected sample metadata field. (To give an idea of what this would look like, for "body site" in the moving pictures dataset, tips unique to gut samples would have a completely red bar; tips split 50/50 between left and right palm samples would have a half blue / half orange bar; and so on.)

   Example: The "Diet" ring in Fig. 1 of Song/Sanders et al., see above.

I imagine these are ranked roughly in order of how useful they'll be (maybe 3 and 2 could be switched around, though). So IMO it makes sense to start with the first type of barplot. (Happily, I think this will also be the easiest of the three to implement :)
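To make the third (stacked) barplot type a bit more concrete, here is a rough Python sketch of how the per-tip fractions could be computed. This is only an illustration under assumed data structures: `stacked_bar_fractions`, `tip_samples`, and `sample_categories` are hypothetical names, not anything that exists in the codebase.

```python
from collections import Counter


def stacked_bar_fractions(tip_samples, sample_categories):
    """Return {category: fraction} for the samples a tip is present in.

    tip_samples: iterable of sample IDs containing this tip (hypothetical input).
    sample_categories: dict mapping sample ID -> value of the selected
        sample metadata field (hypothetical input).
    """
    # Count how many of this tip's samples fall into each category,
    # skipping samples that have no value for the field.
    counts = Counter(
        sample_categories[s] for s in tip_samples if s in sample_categories
    )
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing to draw for this tip
    return {category: n / total for category, n in counts.items()}


# Toy "body site" data loosely mirroring the example in the text.
sample_categories = {
    "s1": "gut",
    "s2": "gut",
    "s3": "left palm",
    "s4": "right palm",
}
gut_only = stacked_bar_fractions(["s1", "s2"], sample_categories)
palm_split = stacked_bar_fractions(["s3", "s4"], sample_categories)
```

With this toy data, the gut-only tip would get a single full-length segment, and the palm tip would get two half-length segments, one per palm category.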

Other considerations

- We would ideally allow users to select multiple "layers" of barplots, which would allow for intricate displays as shown in the Song/Sanders et al. tree above.
- Barplots should work with either circular or rectangular layouts, since both of these guarantee that tips will be allocated some space to themselves in a consistent way (... if that makes sense, there's probably a more elegant way to phrase that).
  - That being said, it might be best to start off with implementing these for the circular layout first, since most of the figures with barplots I've seen use a circular layout.
- All of the figures above (and probably like 95% of the tree figures I've seen while working in bioinformatics, let's be real) use iTOL, so we should of course cite iTOL in the code, paper, etc. as the inspiration for this functionality.
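As a rough illustration of why both layouts work (each tip gets its own slot), here is a Python sketch of the bar geometry. Everything here is an assumption for illustration: the function names are hypothetical, the circular case assumes tips are evenly spaced around the full circle, and the wedge is approximated by a four-corner polygon rather than a true arc.

```python
import math


def rect_bar(y, x_start, length, thickness):
    """Corners (x, y) of a bar for a tip at vertical position y (rectangular layout)."""
    half = thickness / 2
    return [
        (x_start, y - half),
        (x_start + length, y - half),
        (x_start + length, y + half),
        (x_start, y + half),
    ]


def circ_bar(theta, r_start, length, n_tips):
    """Corners of a bar for a tip at angle theta in radians (circular layout).

    Assumes n_tips tips evenly spaced around the circle, so each tip owns
    an arc of 2*pi / n_tips centered on its angle.
    """
    half = math.pi / n_tips  # half of this tip's arc
    r_end = r_start + length
    lo, hi = theta - half, theta + half
    return [
        (r_start * math.cos(lo), r_start * math.sin(lo)),
        (r_end * math.cos(lo), r_end * math.sin(lo)),
        (r_end * math.cos(hi), r_end * math.sin(hi)),
        (r_start * math.cos(hi), r_start * math.sin(hi)),
    ]


def layer_starts(base, layer_lengths, gap=0.0):
    """Starting offset of each barplot layer when layers are stacked outward."""
    starts, pos = [], base
    for length in layer_lengths:
        starts.append(pos)
        pos += length + gap
    return starts


bar = rect_bar(y=2.0, x_start=10.0, length=5.0, thickness=1.0)
wedge = circ_bar(theta=0.0, r_start=10.0, length=5.0, n_tips=4)
starts = layer_starts(100.0, [20.0, 30.0], gap=5.0)
```

The same `layer_starts` offsets could serve as `x_start` in the rectangular layout or `r_start` in the circular one, which is one way multiple layers could share logic between layouts.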

ElDeveloper · 2020-07-18T00:58:51Z

Thanks for breaking this down @fedarko, very helpful. After thinking about this for a little bit, here are some thoughts. I had to think of it in terms of features and samples:

Feature metadata bars:
- Length defined by feature metadata variable or default size if unspecified.
- Color defined by feature metadata variable or default color - this would optionally support continuous color maps.
Sample metadata bars:
- For a categorical variable, show stacked bar chart of prevalence across samples for a metadata variable. For example percent healthy vs sick samples with each feature (we do this in the qemistree preprint - see figure 3).
- For a continuous variable, show the average value across samples for a metadata variable. For example average pH per feature. In this case color and height can be fixed (initially) but should be able to be defined by other metadata variables.
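A minimal Python sketch of the continuous sample metadata case, plus length scaling with a default size, might look like the following. The names `mean_per_feature` and `scale_lengths`, the input dictionaries, and the min-max scaling choice are all assumptions made for illustration, not a description of how this is (or will be) implemented.

```python
def mean_per_feature(presence, sample_values):
    """Average a continuous sample metadata value over each feature's samples.

    presence: {feature ID: [sample IDs containing it]} (hypothetical input).
    sample_values: {sample ID: numeric value, e.g. pH} (hypothetical input).
    Features with no usable samples are left out of the result.
    """
    means = {}
    for feature, samples in presence.items():
        vals = [sample_values[s] for s in samples if s in sample_values]
        if vals:
            means[feature] = sum(vals) / len(vals)
    return means


def scale_lengths(tips, values, min_len=10.0, max_len=100.0, default_len=10.0):
    """Linearly map values onto [min_len, max_len]; tips lacking a value get default_len."""
    present = [values[t] for t in tips if t in values]
    if not present:
        return {t: default_len for t in tips}
    lo, hi = min(present), max(present)
    span = hi - lo
    lengths = {}
    for t in tips:
        if t not in values:
            lengths[t] = default_len
        elif span == 0:
            lengths[t] = max_len  # all values equal: avoid dividing by zero
        else:
            lengths[t] = min_len + (values[t] - lo) / span * (max_len - min_len)
    return lengths


presence = {"f1": ["s1", "s2"], "f2": ["s3"], "f3": []}
ph = {"s1": 6.0, "s2": 8.0, "s3": 5.0}
means = mean_per_feature(presence, ph)
lengths = scale_lengths(["f1", "f2", "f3"], means)
```

Here `f3` appears in no samples, so it falls back to the default length rather than being scaled.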

For drawing the bars, I think using shaders will be the most performant solution. I think addressing #214 should help us get started.

In both cases it sounds like we should allow multiple rings of information. In any case, I agree that we should start with the case that's easier to implement and move from there.

I agree 🎩-tip to iTOL and other tools like Anvio, ggtree, FigTree, Topiary Explorer, and so many more ✨


fedarko added the feature request label Jun 23, 2020

ElDeveloper added this to the Beta Release milestone Jul 7, 2020

ElDeveloper assigned fedarko and kwcantrell Jul 7, 2020

fedarko changed the title ~~Barplots in the circular layout~~ Barplots in the circular / rectangular layout Jul 17, 2020

fedarko added a commit to fedarko/empress that referenced this issue Jul 28, 2020

ENH: add (very very early) barplot-drawing code

174c498

debatably relevant to biocore#201 only assumes rect layout, doesn't use very fancy logic, etc etc. still cool ;D

This was referenced Aug 4, 2020

Draw feature / sample metadata tip-level barplots in the rectangular layout #293

Merged

Add support for circular layout barplots #297

Closed

Draw legends for each barplot layer #299

Closed

Support drawing origin for barplot length-scaling #300

Open

fedarko added the barplots label Aug 5, 2020

kwcantrell closed this as completed in #293 Aug 5, 2020

fedarko mentioned this issue Aug 25, 2020

Support drawing quantitative sample metadata barplots #353

Open

fedarko mentioned this issue Mar 30, 2021

Enable stacked barplots for feature metadata #506

Open
