forked from YuLab-SMU/clusterProfiler-book
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path07-msigdb.Rmd
36 lines (26 loc) · 1.28 KB
/
07-msigdb.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
# MSigDb analysis {#chapter7}
```{r include=FALSE}
library(knitr)
opts_chunk$set(message=FALSE, warning=FALSE, eval=TRUE, echo=TRUE, cache=TRUE)
library(clusterProfiler)
```
The MSigDB is a collection of annotated gene sets, it include 8 major collections:
* H: hallmark gene sets
* C1: positional gene sets
* C2: curated gene sets
* C3: motif gene sets
* C4: computational gene sets
* C5: GO gene sets
* C6: oncogenic signatures
* C7: immunologic signatures
Users can use `enricher` and `GSEA` function to analyze gene set collections downloaded from Molecular Signatures Database ([MSigDb](http://www.broadinstitute.org/gsea/msigdb/index.jsp)). [clusterProfiler](https://www.bioconductor.org/packages/clusterProfiler) provides a function, `read.gmt`, to parse the [gmt file](www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29) into a _TERM2GENE_ `data.frame` that is ready for both `enricher` and `GSEA` functions.
```{r}
data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]
gmtfile <- system.file("extdata", "c5.cc.v5.0.entrez.gmt", package="clusterProfiler")
c5 <- read.gmt(gmtfile)
egmt <- enricher(gene, TERM2GENE=c5)
head(egmt)
egmt2 <- GSEA(geneList, TERM2GENE=c5, verbose=FALSE)
head(egmt2)
```