V-22OCT-01 not called correctly #12

Open
ulfschaefer opened this issue Dec 16, 2022 · 3 comments

Comments

@ulfschaefer

ulfschaefer commented Dec 16, 2022

monologue-underling variants get called as "alt-probable" although they should be confirmed. The reason seems to be that the MNP P13L gets called as a wild type, when it is actually in the sequence:

                    {
                        "amino-acid-change": "P13L",
                        "codon-change": "CCC-CTT",
                        "gene": "N",
                        "one-based-reference-position": 28310,
                        "predicted-effect": "non-synonymous",
                        "protein": "nucleocapsid phosphoprotein",
                        "protein-codon-position": 13,
                        "reference-base": "CCC",
                        "type": "MNP",
                        "variant-base": "CTT",
                        "status": "no-detect"
                    }

In my example both positions 28311 and 28312 are T. I suspect the problem is related to the "one-based-reference-position" pointing to a base (28310, the first C of the codon) that matches the reference in the sample.

I am attaching the example I used.

example_barcode05.muscle.aln.fasta.zip

I spoke to the author of the definitions and we agreed that MNPs are denoted inconsistently across the yaml files. There will be an update so that, for all MNPs:

  • they will always be per-codon, so length 3, even if the change spans two neighbouring codons (two neighbouring codons = two MNPs)
  • the "one-based-reference-position" will always be the position of the first nucleotide in the codon, whether that nucleotide changes or not
  • the "reference-base" and the "variant-base" will always have all 3 nucleotides, whether they change or not, essentially giving the same information as the "codon-change" field
  • non-changing nucleotides in the MNP will be shown as N in the "variant-base" field to indicate that the MNP is to be called no matter what the query sequence is at that position.

Sorry about the faff.
Ulf
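
For readers following along, the per-codon convention listed above can be sketched in a few lines of Python. This is only an illustration and not aln2type's actual code: the function name `call_codon_mnp` is hypothetical, and it assumes the sample sequence is already in reference coordinates (no insertions, so string index = position - 1).

```python
def call_codon_mnp(sample_seq, position, variant_base):
    """Return True if the codon starting at 1-based `position` in
    `sample_seq` matches `variant_base`, treating N as a wildcard.

    Hypothetical helper, not part of aln2type. Assumes `sample_seq`
    is in reference coordinates (index = position - 1).
    """
    if len(variant_base) != 3:
        raise ValueError("codon-style MNPs are expected to have length 3")
    start = position - 1
    observed = sample_seq[start:start + 3].upper()
    if len(observed) != 3:
        # Codon runs off the end of the sequence: cannot be called.
        return False
    # An N in the definition matches any base in the sample.
    return all(v == "N" or v == o
               for v, o in zip(variant_base.upper(), observed))


# Toy sequence with the codon CCT starting at 1-based position 11.
toy = "A" * 10 + "CCT" + "A" * 5
print(call_codon_mnp(toy, 11, "CCN"))  # third base is a wildcard
print(call_codon_mnp(toy, 11, "NTT"))  # second base differs
```

Under these rules P13L would presumably be written with position 28310 and a variant-base of NTT, so the unchanged first C no longer affects the call.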

@abeazer
Collaborator

abeazer commented Jan 27, 2023

Hello Ulf, apologies for the delay in our response, and thank you for raising this issue. This looks to be related to an issue we've identified with how aln2type handles MNPs. We are working to release an update to aln2type that fixes this issue and makes it compatible with the new style of definition.

@ulfschaefer
Author

Thanks abeazer, that all sounds good.

FYI, I have had a similar issue where V-23JAN-01 (XBB.1.5) is called as probable when it should be confirmed. It was because the variant for F486P is not called even though it's definitely in the sequence.

{
                            "amino-acid-change": "F486P",
                            "codon-change": "TTT-CCT",
                            "gene": "S",
                            "one-based-reference-position": 23018,
                            "predicted-effect": "non-synonymous",
                            "protein": "surface glycoprotein",
                            "protein-codon-position": 486,
                            "reference-base": "TTT",
                            "type": "MNP",
                            "variant-base": "CCN",
                            "sample-call": "CCC"
                        },

The codon is definitely CCT in the sample in the above case.

Thanks
Ulf

@abeazer
Collaborator

abeazer commented Jan 27, 2023

Hi Ulf, thanks again!

We've spotted this issue and found it's due to aln2type currently being incompatible with the newer definition style of using the codon for the variant-base. Updating aln2type to the new style is the other major fix we're working on.

In the meantime, adjusting the definition yaml to reference-base: TT and variant-base: CC will allow aln2type to correctly call the mutation.

Thanks!
Andi
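
The interim workaround above (trimming a codon-style definition down to only the changed bases) can be sketched as follows. This is a hypothetical helper, not something shipped with aln2type or the definitions repository. Filling an interior N with the reference base is an assumption of this sketch; the thread does not say how a non-contiguous change should be written for the current aln2type.

```python
def trim_codon_mnp(position, reference_base, variant_base):
    """Trim a codon-style MNP to the span of bases that actually change.

    Returns (new_position, trimmed_reference, trimmed_variant).
    Hypothetical helper for illustration only.
    """
    ref = reference_base.upper()
    var = variant_base.upper()
    if len(ref) != len(var):
        raise ValueError("reference-base and variant-base differ in length")
    # Offsets where the definition requires a non-reference base.
    changed = [i for i, (r, v) in enumerate(zip(ref, var))
               if v != "N" and v != r]
    if not changed:
        raise ValueError("definition contains no changed bases")
    first, last = changed[0], changed[-1]
    # Assumption: an N between two changed bases becomes the reference base.
    trimmed_var = "".join(r if v == "N" else v
                          for r, v in zip(ref[first:last + 1],
                                          var[first:last + 1]))
    return position + first, ref[first:last + 1], trimmed_var


# F486P from this thread: TTT -> CCN at 23018
print(trim_codon_mnp(23018, "TTT", "CCN"))
# P13L from this thread: CCC -> CTT at 28310
print(trim_codon_mnp(28310, "CCC", "CTT"))
```

For F486P this yields reference-base TT and variant-base CC at 23018, matching the suggested adjustment. For P13L the position shifts to 28311, the first base that actually changes.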
