Commit

Merge pull request #87 from bcbio/feature_rnaseq_qc_methods_shs
Add methods draft and restructure similarity analysis
lpantano authored Jan 31, 2025
2 parents 7e048a9 + c98bd9a commit 43ed340
Showing 1 changed file with 29 additions and 13 deletions.
42 changes: 29 additions & 13 deletions inst/templates/rnaseq/01_quality_assesment/QC.Rmd
@@ -418,19 +418,6 @@ pca2 + scale_color_grafify(palette = "kelly")
```

# Covariates analysis

When multiple factors can influence the results of an experiment, it is useful to assess which of them accounts for the most variance as determined by PCA. This analysis adapts the method described by Daily et al., which correlates covariates with principal component values to determine the importance of each factor.

```{r covariate-plot,fig.height=12, fig.width=10}
## Remove non-useful columns output by nf-core
coldat_2 <- data.frame(coldat_for_pca[,!(colnames(coldat_for_pca) %in% c("fastq_1", "fastq_2", "salmon_library_types", "salmon_compatible_fragment_ratio", "samtools_reads_mapped_percent", "samtools_reads_properly_paired_percent", "samtools_mapped_passed_pct", "strandedness", "qualimap_5_3_bias"))])
# Remove missing data
coldat_2 <- na.omit(coldat_2)
degCovariates(vst, metadata = coldat_2)
```

## Hierarchical clustering

Inter-correlation analysis (ICA) is another way to look at how well samples
@@ -461,6 +448,35 @@ p <- pheatmap(vst_cor,
p
```

# Covariates analysis

When multiple factors can influence the results of an experiment, it is useful to assess which of them accounts for the most variance as determined by PCA. This analysis adapts the method described by Daily et al., which correlates covariates with principal component values to determine the importance of each factor.

```{r covariate-plot,fig.height=12, fig.width=10}
## Remove non-useful columns output by nf-core
coldat_2 <- data.frame(coldat_for_pca[,!(colnames(coldat_for_pca) %in% c("fastq_1", "fastq_2", "salmon_library_types", "salmon_compatible_fragment_ratio", "samtools_reads_mapped_percent", "samtools_reads_properly_paired_percent", "samtools_mapped_passed_pct", "strandedness", "qualimap_5_3_bias"))])
# Remove missing data
coldat_2 <- na.omit(coldat_2)
degCovariates(vst, metadata = coldat_2)
```
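
The idea behind this step can be illustrated with a minimal sketch in base R. It is not the `degCovariates` implementation: the data are simulated, and every name in it (`meta`, `expr`, `rin`, `cor_mat`) is hypothetical. It only shows the general approach of running PCA on the samples and correlating each covariate with each principal component.

```r
# Illustrative sketch only: simulated data and hypothetical names,
# not the DEGreport::degCovariates implementation.
set.seed(1)

n_samples <- 12
n_genes <- 200

# Hypothetical sample metadata: one categorical and one numeric covariate.
meta <- data.frame(
  condition = factor(rep(c("control", "treated"), each = n_samples / 2)),
  rin = runif(n_samples, min = 6, max = 10)
)
rownames(meta) <- paste0("sample", seq_len(n_samples))

# Simulated expression matrix (genes x samples).
expr <- matrix(
  rnorm(n_genes * n_samples),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), rownames(meta))
)

# Add a condition effect to the first 50 genes in treated samples.
treated <- meta$condition == "treated"
expr[1:50, treated] <- expr[1:50, treated] + 3

# PCA on samples: prcomp expects observations (samples) in rows.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
pcs <- pca$x[, 1:4]

# Encode covariates as numbers so each can be correlated with each PC
# (factors become their integer level codes).
meta_num <- data.frame(lapply(meta, as.numeric), row.names = rownames(meta))

# Absolute Pearson correlation between each covariate (rows) and each PC (columns).
cor_mat <- abs(cor(meta_num, pcs))
round(cor_mat, 2)
```

In this toy setup the simulated `condition` effect should dominate the first principal component, while `rin`, which was generated independently of the expression values, is not expected to track any component strongly. Integer-coding a factor is only reasonable for two-level covariates like this one; covariates with more levels would need a different association measure.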

# Conclusions



# Methods

RNA-seq counts were generated by the nf-core rnaseq pipeline [version] using Salmon (Patro et al. 2017). Downstream analyses were performed using `r version$version.string`. Counts were imported into R using DESeq2 version `r packageVersion("DESeq2")` (Love, Huber, and Anders 2014). Gene annotations were obtained from Ensembl. Plots were generated with ggplot2 (Wickham 2016), and heatmaps were generated with pheatmap (Kolde 2019).

## R package references

```{r citations}
citation("DESeq2")
citation("ggplot2")
citation("pheatmap")
```

# R session

List and versions of the tools used to generate the QC report.