This repository has been archived by the owner on Oct 2, 2024. It is now read-only.

Generate summary enrichment heatmaps #55

Merged: 8 commits, Feb 29, 2024

@rjcorb (Contributor) commented Feb 12, 2024

Purpose/implementation Section

What scientific question is your analysis addressing?

Closes #51. This PR adds a function that generates enrichment heatmaps and applies it to the heatmaps in 02-summary-stats.R.

What was your approach?

  • Write a function that generates an enrichment heatmap for two factors.
  • Apply the function to create heatmaps for ancestry x race and ancestry x ethnicity. The function is also used to regenerate heatmaps that were previously made with ggplot2 and showed only counts.
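The PR's implementation lives in heatmap_function.R and is not shown in this thread. As a rough illustration of what a count-plus-enrichment heatmap cell could contain, here is a minimal Python sketch. It assumes that "enrichment" means the observed/expected ratio under independence of the two factors, with a one-sided hypergeometric p-value per cell; that definition, the function name `enrichment_table`, and the toy data are all hypothetical and may differ from what the PR actually does.

```python
from collections import Counter
from math import comb


def enrichment_table(pairs):
    """Map each (level_a, level_b) cell to (count, enrichment, p_value).

    Assumed definitions (not taken from the PR):
      enrichment = observed count / count expected if the factors were independent
      p_value    = one-sided hypergeometric tail, P(X >= observed count)
    """
    pairs = list(pairs)
    n = len(pairs)
    cell = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)

    table = {}
    for a in row:
        for b in col:
            k = cell[(a, b)]
            expected = row[a] * col[b] / n
            # Sum hypergeometric probabilities from k up to the largest possible overlap.
            upper = min(row[a], col[b])
            tail = sum(
                comb(col[b], x) * comb(n - col[b], row[a] - x)
                for x in range(k, upper + 1)
            )
            table[(a, b)] = (k, k / expected, tail / comb(n, row[a]))
    return table


# Hypothetical toy data: 20 samples, two placeholder levels per factor.
toy = (
    [("group_1", "cat_x")] * 8
    + [("group_1", "cat_y")] * 2
    + [("group_2", "cat_x")] * 2
    + [("group_2", "cat_y")] * 8
)
table = enrichment_table(toy)
```

Each cell carries both the raw count and the enrichment value, so a heatmap can color by enrichment while printing the count as the cell label.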

What GitHub issue does your pull request address?

#51

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Which areas should receive a particularly close look?

Please review the code in heatmap_function.R and the newly generated enrichment heatmaps.

Is there anything that you want to discuss further?

No

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Results

What types of results are included (e.g., table, figure)?

  • race_ancestry_ct_enr_heatmap.pdf
  • ethnicity_ancestry_ct_enr_heatmap.pdf
  • lgg_tumor_resection_by_predicted_ancestry.pdf
  • dmg_region_ancestry_ct_enr_heatmap.pdf
  • lgg_tumor_location_by_predicted_ancestry.pdf

What is your summary of the results?

  • Genetic ancestry groups are most enriched in the expected self-reported race and ethnicity categories. AMR and SAS ancestries are significantly enriched among patients whose reported race is unknown.
  • AMR and EAS pLGG patients are significantly more likely than patients of other ancestries to have received only a biopsy.

Reproducibility Checklist

  • The dependencies required to run the code in this pull request have been added to the
    project Dockerfile.
  • This analysis has been added to continuous integration.

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • The analytical code is documented and contains comments.

@jharenza (Collaborator) left a comment


Nice work!

For this plot, I think the "Mixed" region is not really informative, and to be honest it may not be informative for any of the plots. It is essentially where we had primary sites that spanned the regions defined by Cassie. DMG specifically is defined by originating in the midline: spine, pons, thalamus, brainstem. So the PF location doesn't make a ton of sense and may be a mis-annotation; if we examined the images, we could probably bucket the mixed regions appropriately. That may be more work than we can do for this paper, but it could be worth it, just for DMG, to annotate those four sites (spine, pons, thalamus, brainstem), though that may need clinician review of the images, and then assess ancestry enrichment by site. So what I am saying is that, in the end, this plot is really a function of the prevalence of each ancestry in DMG.

In the group-by-ancestry plot, the unknown race category is blank. Should it be removed?
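If the blank group is simply one with no observations (an assumption; the thread does not say why it is blank), one way to handle it is to drop all-zero rows and columns from the count table before plotting. The sketch below is in Python rather than the module's R, and `drop_empty_levels` and the toy counts are hypothetical names and values used only for illustration.

```python
def drop_empty_levels(counts):
    """Remove rows and columns that contain no observations.

    counts: dict mapping row label -> dict mapping column label -> count.
    """
    # Keep only rows with at least one non-zero count.
    rows = {r: c for r, c in counts.items() if any(c.values())}
    # Keep only columns that are non-zero in at least one remaining row.
    cols = {name for c in rows.values() for name, v in c.items() if v}
    return {
        r: {name: v for name, v in c.items() if name in cols}
        for r, c in rows.items()
    }


# Hypothetical toy counts: level_b has no observations, and neither does column "y".
toy_counts = {
    "level_a": {"x": 3, "y": 0},
    "level_b": {"x": 0, "y": 0},
    "level_c": {"x": 1, "y": 0},
}
trimmed = drop_empty_levels(toy_counts)
```

Filtering before computing enrichment keeps empty groups out of the heatmap without changing the counts in the remaining cells.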

Can you also add script 03 to the bash script?

Base automatically changed from rjcorb/50-add-breakpoint-enr to main February 13, 2024 18:30
@rjcorb rjcorb merged commit f91742a into main Feb 29, 2024
1 check passed
@rjcorb rjcorb deleted the rjcorb/51-race-ethn-heatmaps branch February 29, 2024 14:51
Linked issue: generate ancestry-race and ancestry-ethnicity enrichment heatmaps