Add a warning in the RMD report when some rows are lost after merging Diff and Features tables #354

Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [[#345](https://github.com/nf-core/differentialabundance/pull/345)] - Plot differentially expressed genes by gene biotype ([@atrigila](https://github.com/atrigila), review by [@grst](https://github.com/grst))
 - [[#343](https://github.com/nf-core/differentialabundance/pull/343)] - Add pipeline-level nf-tests ([@atrigila](https://github.com/atrigila), review by [@pinin4fjords](https://github.com/pinin4fjords) and [@nschcolnicov](https://github.com/nschcolnicov))
 - [[#286](https://github.com/nf-core/differentialabundance/pull/286)] - Integration of limma voom for rnaseq data ([@KamilMaliszArdigen](https://github.com/KamilMaliszArdigen), review by [@pinin4fjords](https://github.com/pinin4fjords))
+- [[#354](https://github.com/nf-core/differentialabundance/pull/354)] - Warning message within the R Markdown report to control when genes don't have annotation data ([@alanmmobbs93](https://github.com/alanmmobbs93)). Review by [@WackerO](https://github.com/WackerO) and [@pinin4fjords](https://github.com/pinin4fjords).

 ### Fixed

66 changes: 52 additions & 14 deletions assets/differentialabundance_report.Rmd
@@ -370,10 +370,13 @@ differential_files <- lapply(contrasts$id, function(d){
   file.path(params$input_dir, paste0(gsub(' |;', '_', d), differential_file_suffix))
 })

-differential_results <- lapply(differential_files, function(diff_file){
-  if (! file.exists(diff_file)){
-    stop(paste("Differential file", diff_file, "does not exist"))
-  }
+# Initialize vector to store warning messages before merging tables
+warnings_list <- c()
+
+# Read differential results and merge with features table
+results <- lapply(differential_files, function(diff_file) {
+  if (!file.exists(diff_file)) stop(paste("Differential file", diff_file, "does not exist"))
+
   diff <- read_differential(
     diff_file,
     feature_id_column = params$differential_feature_id_column,
@@ -382,19 +385,49 @@ differential_results <- lapply(differential_files, function(diff_file){
     qval_column = params$differential_qval_column
   )

-  # If fold changes are not logged already, log them (we assume they're logged
-  # later on)
-
-  if (! params$differential_foldchanges_logged){
+  # Log transform fold changes if not already logged
+  if (!params$differential_foldchanges_logged) {
     diff[[params$differential_fc_column]] <- log2(diff[[params$differential_fc_column]])
   }

-  # Annotate differential tables if possible
-  if (! is.null(params$features)){
-    diff <- merge(features, diff, by.x = params$features_id_col, by.y = params$differential_feature_id_column)
+  # Annotate differential table if features table is provided
+  if (!is.null(params$features)) {
+    ## Merge Differential expression table on features table
+    merged <- merge(features, diff, by.x = params$features_id_col, by.y = params$differential_feature_id_column)
+
+    ## Get number of missing rows
+    n_missing <- length(setdiff(diff[[params$differential_feature_id_column]], merged[[params$features_id_col]]))
+
+    ## Create warnings if necessary
+    warnings <- c(
+      ## Missing IDs
+      if (n_missing > 0) sprintf(
+        '<p style="color:#DAA520;"><strong>WARNING:</strong> %d IDs from the differential table (%s) were lost on merge with features table (%s).</p>',
+        n_missing, basename(diff_file), basename(params$features)
+      ),
+      ## Check whether there are fewer rows, missing data
+      if (nrow(merged) < nrow(diff)) sprintf(
+        '<p style="color:#DAA520;"><strong>WARNING:</strong> Rows were lost on merge (%s -> %s). Original: %d, Merged: %d.</p>',
+        basename(diff_file), basename(params$features), nrow(diff), nrow(merged)
+      ),
+      ## Check whether there are more rows, possible duplications
+      if (nrow(merged) > nrow(diff)) sprintf(
+        '<p style="color:#DAA520;"><strong>WARNING:</strong> Rows were duplicated on merge (%s -> %s). Original: %d, Merged: %d.</p>',
+        basename(diff_file), basename(params$features), nrow(diff), nrow(merged)
+      )
+    )
+  } else {
+    merged <- diff
+    warnings <- character(0)
   }
-  diff
+  ## Collect results
+  list(diff_features = merged, warnings = warnings)
 })
+
+# Separate differential_results and warnings_list from results
+differential_results <- lapply(results, `[[`, "diff_features")
+warnings_list <- unlist(lapply(results, `[[`, "warnings"))
+
 names(differential_results) <- contrasts$id
 ```
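
The merge check in the hunk above can be tried in isolation. The following is only a minimal sketch on toy data, not the pipeline's code: the `check_merge` helper, the `gene_id` column and the table contents are all hypothetical, and it assumes both tables share one ID column name. It reports which IDs were lost and which were duplicated, instead of comparing row counts alone.

```r
# Toy tables (hypothetical): "g2" is annotated twice and "g3" has no annotation
features <- data.frame(
  gene_id = c("g1", "g2", "g2"),
  gene_name = c("A", "B", "B2"),
  stringsAsFactors = FALSE
)
diff <- data.frame(
  gene_id = c("g1", "g2", "g3"),
  log2FoldChange = c(1.5, -0.7, 2.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: inner-join the two tables and report ID-level problems
check_merge <- function(features, diff, id_col) {
  merged <- merge(features, diff, by = id_col)
  # IDs present in the differential table but absent after the merge
  lost_ids <- setdiff(diff[[id_col]], merged[[id_col]])
  # IDs that appear more than once in the merged table
  duplicated_ids <- unique(merged[[id_col]][duplicated(merged[[id_col]])])
  list(merged = merged, lost_ids = lost_ids, duplicated_ids = duplicated_ids)
}

res <- check_merge(features, diff, "gene_id")
print(res$lost_ids)
print(res$duplicated_ids)
```

In this toy case `g3` is dropped and `g2` is duplicated, yet `nrow(res$merged)` equals `nrow(diff)`. A lost row and a duplicated row can cancel out in a pure row-count comparison, which is why an ID-level check such as the `setdiff()` call is the more informative of the two.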

@@ -787,7 +820,6 @@ foo <- lapply(names(p_value_types), function(pvt){
 ```

 ```{r, echo=FALSE, results='asis', eval = FALSE}
-
 differential_summary_string <- paste(
   paste(
     lapply(
@@ -806,7 +838,13 @@ cat(differential_summary_string)

 ### Differential `r params$features_type` details

-```{r, echo=FALSE, results='asis'}
+```{r, echo=FALSE, results='asis', warning=FALSE, message=FALSE}
+
+# Display all warnings related to number of rows
+if (length(warnings_list) > 0) {
+  for (warning in warnings_list) { cat(warning) }
+}
+
 for (i in 1:nrow(contrasts)){
   cat("\n#### ", contrast_descriptions[i], " {.tabset}\n")
