Can DISCOVER be used with data from targeted sequencing? #26

jud-b · 2024-03-06T10:10:21Z

Hi,

I have successfully used DISCOVER with data from whole-exome sequencing. I am wondering whether I can also use it with targeted sequencing data generated by MSK-IMPACT or DFCI OncoPanel. Are there enough mutation events from only 300-500 genes to estimate the background mutation rate? Are there specific assumptions that are not met when using targeted sequencing data?
Your help would be much appreciated.

Thanks.

scanisius · 2024-03-13T13:09:07Z

Using DISCOVER with gene panels of a few hundred genes works very well. In the DISCOVER paper, we used whole-exome data on the assumption that estimating the background model benefits from having mutation data for as many genes as possible. Since then, we have also applied DISCOVER to gene panel data, and for panels of a few hundred genes the results are very similar to those obtained with whole-exome data. You should probably be more careful with very small gene panels though.

To illustrate the concept, have a look at the R code below, which subsets the included breast cancer mutation data to the MSK-IMPACT panel genes and compares the results with those of the whole-exome analysis.

library(discover)

data(BRCA.mut)

# Download MSK-IMPACT panel genes
panel_info <- readLines(url("https://media.githubusercontent.com/media/cBioPortal/datahub/master/reference_data/gene_panels/data_gene_panel_impact505.txt"))
msk_impact_genes <- unlist(strsplit(unlist(strsplit(grep("^gene_list:", panel_info, value = TRUE), " "))[2], "\t"))


# Keep only the panel genes that are present in the mutation matrix
msk_impact_genes <- intersect(rownames(BRCA.mut), msk_impact_genes)

# Fit the background model on all genes, then keep only the panel genes
events_all_genes <- discover.matrix(BRCA.mut)
events_all_genes <- events_all_genes[msk_impact_genes, ]

# Fit the background model on the panel genes only
events_msk_impact <- discover.matrix(BRCA.mut[msk_impact_genes, ])


# Perform DISCOVER test for genes with more than 25 mutations
subset <- rowSums(events_msk_impact$events) > 25

result_all_genes <- pairwise.discover.test(events_all_genes[subset, ])
result_msk_impact <- pairwise.discover.test(events_msk_impact[subset, ])


# Compare the resulting P values
mask <- lower.tri(result_all_genes$p.values)
p_all_genes <- result_all_genes$p.values[mask]
p_msk_impact <- result_msk_impact$p.values[mask]

plot(-log10(p_all_genes), -log10(p_msk_impact))
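
To put a number on the agreement that the scatter plot shows, one option is to compute a rank correlation of the two sets of P values and count how many gene pairs fall below a significance threshold in one analysis, the other, or both. The following is a minimal sketch, not part of the discover package: `compare_p_values` is a hypothetical helper, the 0.05 threshold is an arbitrary assumption, and the toy P values are simulated purely so that the block runs on its own.

```r
# Hypothetical helper (not part of the discover package): summarise how well
# two vectors of P values for the same gene pairs agree.
compare_p_values <- function(p_a, p_b, alpha = 0.05) {
  stopifnot(length(p_a) == length(p_b))

  # Drop pairs for which -log10(P) would be undefined or infinite
  keep <- is.finite(p_a) & is.finite(p_b) & p_a > 0 & p_b > 0
  p_a <- p_a[keep]
  p_b <- p_b[keep]

  list(
    n = sum(keep),
    # Rank correlation on the -log10 scale used in the scatter plot
    spearman = cor(-log10(p_a), -log10(p_b), method = "spearman"),
    # Agreement of calls at the (assumed) threshold alpha
    both_significant = sum(p_a < alpha & p_b < alpha),
    only_a = sum(p_a < alpha & p_b >= alpha),
    only_b = sum(p_a >= alpha & p_b < alpha)
  )
}

# Simulated toy input, for illustration only: p_b is a noisy copy of p_a
set.seed(1)
p_a <- runif(100)
p_b <- pmin(1, p_a * exp(rnorm(100, sd = 0.1)))

summary_toy <- compare_p_values(p_a, p_b)
str(summary_toy)
```

With the objects from the code above, the call would be `compare_p_values(p_all_genes, p_msk_impact)`. Note that the threshold is applied to unadjusted P values here, so the counts only describe concordance between the two analyses and are not a substitute for multiple-testing correction.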
