I have successfully used DISCOVER with whole-exome sequencing data. I am wondering whether I can also use it with targeted sequencing data generated by MSK-IMPACT or DFCI OncoPanel. Are there enough mutation events from only 300-500 genes to estimate the background mutation rate? Are there specific assumptions that are not met when using targeted sequencing data?
Your help would be much appreciated.
Thanks.
Using DISCOVER with gene panels of a few hundred genes works very well. In the DISCOVER paper, we used whole-exome data on the assumption that estimation of the background model benefits from having mutation data for as many genes as possible. Since then, we have also applied DISCOVER to gene panel data, and we have observed that for panels of a few hundred genes the results are very similar to those obtained with whole-exome data. You should probably be more careful with very small gene panels, though.
To illustrate the concept, have a look at the R code below, which subsets the included breast cancer mutation data to the MSK-IMPACT panel genes and compares the results with those of the whole-exome analysis.
library(discover)
data(BRCA.mut)

# Download the MSK-IMPACT panel gene list
panel_info <- readLines(url("https://media.githubusercontent.com/media/cBioPortal/datahub/master/reference_data/gene_panels/data_gene_panel_impact505.txt"))
gene_line <- grep("^gene_list:", panel_info, value = TRUE)
# Drop the "gene_list:" prefix, then split the tab-separated gene symbols
msk_impact_genes <- unlist(strsplit(sub("^gene_list:\\s*", "", gene_line), "\t"))
# Keep only the panel genes that are present in the mutation matrix
msk_impact_genes <- intersect(rownames(BRCA.mut), msk_impact_genes)

# Fit background model for full and panel mutation data
events_all_genes <- discover.matrix(BRCA.mut)
events_all_genes <- events_all_genes[msk_impact_genes, ]
events_msk_impact <- discover.matrix(BRCA.mut[msk_impact_genes, ])

# Perform DISCOVER test for genes with more than 25 mutations
subset <- rowSums(events_msk_impact$events) > 25
result_all_genes <- pairwise.discover.test(events_all_genes[subset, ])
result_msk_impact <- pairwise.discover.test(events_msk_impact[subset, ])

# Compare the resulting P values
mask <- lower.tri(result_all_genes$p.values)
p_all_genes <- result_all_genes$p.values[mask]
p_msk_impact <- result_msk_impact$p.values[mask]
plot(-log10(p_all_genes), -log10(p_msk_impact))
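If you would like a number to go with the scatter plot, a rough sketch like the following could summarise the agreement between the two sets of P values. Note that `compare_pvalues` is a hypothetical helper written for this illustration, not a function of the discover package, and it assumes only that both inputs are square P-value matrices of the same dimensions, as `result_all_genes$p.values` and `result_msk_impact$p.values` are above.

```r
# Hypothetical helper (not part of the discover package): summarise how well
# two pairwise P-value matrices agree on the -log10 scale.
compare_pvalues <- function(p_a, p_b) {
  stopifnot(is.matrix(p_a), is.matrix(p_b), identical(dim(p_a), dim(p_b)))

  # Each gene pair appears once in the lower triangle
  mask <- lower.tri(p_a)
  x <- -log10(p_a[mask])
  y <- -log10(p_b[mask])

  # Skip pairs with missing P values or P values of exactly zero
  ok <- is.finite(x) & is.finite(y)

  c(n_pairs = sum(ok),
    spearman = cor(x[ok], y[ok], method = "spearman"),
    max_abs_diff = max(abs(x[ok] - y[ok])))
}
```

With the objects from the code above, this would be called as `compare_pvalues(result_all_genes$p.values, result_msk_impact$p.values)`. A rank correlation close to 1 and a small maximum difference would be consistent with the panel-based background model giving results similar to the whole-exome one; what counts as "close enough" is a judgment call for your own data.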