
What is the strictest cloneCall? #472

Closed
airabiotenor opened this issue Feb 4, 2025 · 1 comment


airabiotenor commented Feb 4, 2025

Hello,

scRepertoire is a phenomenal tool for integrating VDJ with existing Seurat data, and I am very grateful that you have made it available! In our BCR data, we are seeing a significant amount of overlapping clones across samples, and even across two separate experiments where our team of B Cell experts is not expecting to see any overlap. We have tried a few strategies to mitigate and remove this issue, and have noticed some discrepancies in clonalOverlap and clonalCompare. According to the docs, it seems that calling clonalCompare with cloneCall = "strict" and chain = "both" should result in the most strict plots:

clonalCompare(combined.BCR,
                  samples = names(contig.list), 
                  cloneCall = "strict",
                  chain = "both",
                  graph = "alluvial") + NoLegend()

Image

Yet when I set the cloneCall to "nt", there is visibly less overlap:

clonalCompare(combined.BCR,
                  samples = names(contig.list), 
                  cloneCall = "nt",
                  chain = "both",
                  graph = "alluvial") + NoLegend()

Image

What is the expected behavior of these alluvial plots? And how can we make the cloneCall as strict as possible to avoid unexpected overlap?
Note: In this experiment, hto 1-4 are an entirely separate library from hto 5-7, which is why we expect very little overlap. Clones were called with:

combined.BCR <- combineBCR(contig.list,
                           samples = names(contig.list),
                           threshold = .9,
                           removeNA = TRUE)

Thank you for your help and continued work on this amazing tool.

@ncborcherding
Member

Hey @airabiotenor,

Thanks for reaching out. First off, it looks like you are already getting overlap at baseline with the nt cloneCall (compare hto4 vs hto5), although clonalCompare() is not visualizing all of the potential overlaps for the other samples. We will get into the strict definition below, but we would expect a potential increase in overlap with strict because it relies on an edit distance calculation.

The strict clonotype is an edit distance calculation of the nucleotide sequences (based on the threshold) + V gene. More info here. This is done across all contigs because it is agnostic to the experimental setup.
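To make that concrete, here is a rough base R sketch of why an edit-distance threshold can group sequences that exact nucleotide matching keeps apart. It is a toy illustration only and not scRepertoire's internal code: the sequences, V gene labels, and the normalization by the longer sequence length are all assumptions made for the example.

```r
# Toy illustration only: NOT scRepertoire's implementation.
# Assumption: similarity = 1 - (Levenshtein distance / longer sequence length).
seqs <- c(
  hto4_cell = "TGTGCGAGAGATCTGGGGTACTTTGACTACTGG",
  hto5_cell = "TGTGCGAGAGATCTGGGGTACTTTGACTATTGG",  # one substitution vs hto4_cell
  hto6_cell = "TGTGCAAAAGGGAGCTACGGTATGGACGTCTGG"
)
v_gene <- c("IGHV3-23", "IGHV3-23", "IGHV1-69")  # made-up V gene calls

# Pairwise Levenshtein distances, scaled to a 0-1 similarity
edit_dist  <- utils::adist(seqs)
max_len    <- outer(nchar(seqs), nchar(seqs), pmax)
similarity <- 1 - edit_dist / max_len

# "Strict"-style grouping: same V gene AND similarity at or above the threshold
same_v  <- outer(v_gene, v_gene, "==")
grouped <- similarity >= 0.9 & same_v

# Exact matching, as a stand-in for an nt-only comparison
exact_match <- outer(seqs, seqs, "==")

grouped["hto4_cell", "hto5_cell"]      # grouped under the 0.9 threshold
exact_match["hto4_cell", "hto5_cell"]  # distinct under exact matching
```

With threshold = 0.9, the two nearly identical sequences end up in the same group even though they come from different samples, which is one way a thresholded definition can show more overlap than an exact one.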

The easiest way to add an additional layer of specificity would be to prepend an experiment-specific tag to the CTstrict column. Here is a simple for loop to do that:

# Prefix CTstrict with an experiment tag so clones cannot match across experiments
for (i in seq_along(combined.BCR)) {
    # Assumes list elements 1-4 belong to the first experiment and the rest to the second
    if (i %in% 1:4) {
        tag <- "Exp1"
    } else {
        tag <- "Exp2"
    }
    combined.BCR[[i]]$CTstrict <- paste0(tag, ";", combined.BCR[[i]]$CTstrict)
}
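As a quick sanity check on the tagging approach, the sketch below applies the same idea to a made-up stand-in for combined.BCR (a named list of data frames with a CTstrict column) and confirms that no clone label is shared between the two experiments afterwards. The object names toy, tagged, and exp_tag are hypothetical and exist only for this example.

```r
# Toy stand-in for combined.BCR: every sample carries "cloneA" plus one private clone.
toy <- lapply(1:7, function(i) {
  data.frame(CTstrict = c("cloneA", paste0("clone", i)), stringsAsFactors = FALSE)
})
names(toy) <- paste0("hto", 1:7)

# Assumption: samples 1-4 are one experiment, 5-7 the other
exp_tag <- ifelse(seq_along(toy) %in% 1:4, "Exp1", "Exp2")

# Same tagging as the for loop, written with Map()
tagged <- Map(function(df, tag) {
  df$CTstrict <- paste0(tag, ";", df$CTstrict)
  df
}, toy, exp_tag)

# Clone labels seen in each experiment
exp1 <- unique(unlist(lapply(tagged[1:4], `[[`, "CTstrict")))
exp2 <- unique(unlist(lapply(tagged[5:7], `[[`, "CTstrict")))

shared_across <- intersect(exp1, exp2)                                  # between experiments
shared_within <- intersect(tagged$hto1$CTstrict, tagged$hto2$CTstrict)  # within Exp1
```

Here shared_across comes back empty while shared_within still contains the clone shared by hto1 and hto2, so overlap within an experiment is preserved and only cross-experiment matches are removed.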

Hope that helps and let me know if you have any additional questions.

Nick
