Add a warning in the RMD report when some rows are lost after merging Diff and Features tables #354

Merged
Changes from 5 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[#345](https://github.com/nf-core/differentialabundance/pull/345)] - Plot differentially expressed genes by gene biotype ([@atrigila](https://github.com/atrigila), review by [@grst](https://github.com/grst))
- [[#343](https://github.com/nf-core/differentialabundance/pull/343)] - Add pipeline-level nf-tests ([@atrigila](https://github.com/atrigila), review by [@pinin4fjords](https://github.com/pinin4fjords) and [@nschcolnicov](https://github.com/nschcolnicov))
- [[#286](https://github.com/nf-core/differentialabundance/pull/286)] - Integration of limma voom for rnaseq data ([@KamilMaliszArdigen](https://github.com/KamilMaliszArdigen), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#354](https://github.com/nf-core/differentialabundance/pull/354)] - Add a warning message to the R Markdown report when genes lack annotation data ([@alanmmobbs93](https://github.com/alanmmobbs93), review by [@WackerO](https://github.com/WackerO) and [@pinin4fjords](https://github.com/pinin4fjords))

### Fixed

86 changes: 75 additions & 11 deletions assets/differentialabundance_report.Rmd
@@ -370,10 +370,17 @@ differential_files <- lapply(contrasts$id, function(d){
file.path(params$input_dir, paste0(gsub(' |;', '_', d), differential_file_suffix))
})

differential_results <- lapply(differential_files, function(diff_file){
if (! file.exists(diff_file)){
# Initialize vector to store warning messages before merging tables
warnings_list <- c()

# Read differential results and merge with features table
results <- lapply(differential_files, function(diff_file) {
warnings <- c() # Initialize local warning vector

if (!file.exists(diff_file)) {
stop(paste("Differential file", diff_file, "does not exist"))
}

diff <- read_differential(
diff_file,
feature_id_column = params$differential_feature_id_column,
@@ -382,19 +389,71 @@ differential_results <- lapply(differential_files, function(diff_file){
qval_column = params$differential_qval_column
)

# If fold changes are not logged already, log them (we assume they're logged
# later on)

if (! params$differential_foldchanges_logged){
# If fold changes are not logged already, log them
if (!params$differential_foldchanges_logged) {
diff[[params$differential_fc_column]] <- log2(diff[[params$differential_fc_column]])
}

# Annotate differential tables if possible
if (! is.null(params$features)){
diff <- merge(features, diff, by.x = params$features_id_col, by.y = params$differential_feature_id_column)
if (!is.null(params$features)) {

# Merge tables
diff_features <- merge(features, diff, by.x = params$features_id_col, by.y = params$differential_feature_id_column)

# Get number of rows before and after merging
rows_diff <- as.numeric(nrow(diff))
rows_diff_features <- as.numeric(nrow(diff_features))

# Check that all IDs were conserved
conserved_ids <- all( diff[[params$differential_feature_id_column]] %in% diff_features[[params$features_id_col]] )

## Check if all IDs are present
if (!conserved_ids) {
missing_ids <- setdiff(diff[[params$differential_feature_id_column]], diff_features[[params$features_id_col]])
warnings <- c(warnings,
paste0(
'<p style="color:#DAA520;"><strong>WARNING:</strong> ', length(missing_ids), ' IDs from the differentially expressed table (', basename(diff_file), ') were absent from the features table (', basename(params$features), ') and lost on merge.\n',
'Missing IDs in diff table: ', paste(missing_ids, collapse = ' '), '.\n',
'Rows in merged table: ', rows_diff_features, '.</p>\n'
)
)
}

# Compare numbers and report
## Check if diff_features has fewer rows than diff, which would indicate loss of information
if ( rows_diff_features < rows_diff ) {
warnings <- c(warnings,
paste0(
'<p style="color:#DAA520;"><strong>WARNING:</strong> Some rows from the differentially expressed table (', basename(diff_file), ') were absent from the features table (', basename(params$features), ') and lost on merge.\n',
'Rows in diff table: ', rows_diff, '.\n',
'Rows in merged table: ', rows_diff_features, '.</p>\n'
)
)
}

## Check if diff_features has more rows than diff, which could indicate duplicated IDs
if ( rows_diff_features > rows_diff ) {
warnings <- c(warnings,
paste0(
'<p style="color:#DAA520;"><strong>WARNING:</strong> Some rows from the differentially expressed table (', basename(diff_file), ') were duplicated on merging with the features table (', basename(params$features), ').\n',
'Rows in diff table: ', rows_diff, '.\n',
'Rows in merged table: ', rows_diff_features, '.</p>\n'
)
)
}

} else {
diff_features <- diff
}
diff

# Return both the results and the local warnings
list(diff_features = diff_features, warnings = warnings)
})

# Separate differential_results and warnings_list from results
differential_results <- lapply(results, `[[`, "diff_features")
warnings_list <- unlist(lapply(results, `[[`, "warnings"))

names(differential_results) <- contrasts$id
```
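
As a minimal sketch of why the chunk above checks IDs as well as row counts, the following uses two hypothetical toy tables. The `gene_id`, `gene_name`, and `log2FC` column names and the values are assumptions for illustration, not pipeline parameters or real data. Base R `merge()` performs an inner join by default, so an unannotated ID is dropped while an ID annotated twice is duplicated, and the two effects can cancel out in the row count.

```r
# Hypothetical toy tables; column names and values are illustrative only
features <- data.frame(
  gene_id   = c("g1", "g2", "g2"),   # g2 is annotated twice, g3 not at all
  gene_name = c("A", "B", "B_alt"),
  stringsAsFactors = FALSE
)
diff <- data.frame(
  gene_id = c("g1", "g2", "g3"),
  log2FC  = c(1.5, -0.7, 2.1),
  stringsAsFactors = FALSE
)

# merge() defaults to an inner join: IDs without a match are dropped silently
merged <- merge(features, diff, by.x = "gene_id", by.y = "gene_id")

# IDs present in the differential table but missing after the merge
missing_ids <- setdiff(diff$gene_id, merged$gene_id)

# IDs that occur more than once after the merge
duplicated_ids <- unique(merged$gene_id[duplicated(merged$gene_id)])

cat("Rows in diff table:", nrow(diff), "\n")
cat("Rows in merged table:", nrow(merged), "\n")
cat("Missing IDs:", missing_ids, "\n")
cat("Duplicated IDs:", duplicated_ids, "\n")
```

In this toy case both tables have three rows before and after the merge, so a row-count comparison alone would raise no warning even though `g3` was lost and `g2` was duplicated. The `setdiff()` check on IDs catches the loss regardless.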

@@ -787,7 +846,6 @@ foo <- lapply(names(p_value_types), function(pvt){
```

```{r, echo=FALSE, results='asis', eval = FALSE}

differential_summary_string <- paste(
paste(
lapply(
@@ -806,7 +864,13 @@ cat(differential_summary_string)

### Differential `r params$features_type` details

```{r, echo=FALSE, results='asis'}
```{r, echo=FALSE, results='asis', warning=FALSE, message=FALSE}

# Display all warnings related to number of rows
if (length(warnings_list) > 0) {
for (warning in warnings_list) { cat(warning) }
}

for (i in 1:nrow(contrasts)){
cat("\n#### ", contrast_descriptions[i], " {.tabset}\n")
