Different Selective forces on a Gene with "L" status #201
Comments
Hi Vinita,
|
Greetings Dr. Hiller @MichaelHiller, thank you for your prompt reply. Following your suggestion, I ran RELAX 10 times for each transcript. For the transcript that initially showed relaxed selection, relaxation was significant in 8 runs and intensification in 2. For the transcript that initially showed intensification, relaxation was significant in 6 runs and intensification in 4. In all 10 runs the likelihood ratio test gave p = 0.0000 (except one run, where it was 0.037), so every run was significant at p <= 0.05.
BUT the interesting part comes:
For example, I have established that on my focal branch there are 30 genes in total with a clear "L" status in every single in-group species. Do I need to cross-check every one of these genes against the transcriptome? Another thing I wanted to point out: when I tested the overlap between the RELAX genes and the gene loss data, this was the only gene with status "L" that appears in both the RELAX list and the loss list. All other genes that overlap between the gene loss and RELAX datasets are in the "UL" category in all my in-group data. (I did this because I was curious how many genes are under relaxed selection if the loss list also includes UL genes.) I look forward to your suggestions on how to resolve such cases from TOGA and how to explain them scientifically. We previously expected this candidate gene to be an adaptive loss on the focal branch, but now the picture is confusing to us. Your guidance will help us move forward. Thanks again |
Regarding 1), yes. The first gene (8 of 10) is obviously more supported than the second (6 of 10). Regarding 2), no. After a gene has lost its protein-coding capacity, meaning it can no longer be translated into a functional protein, it can still be expressed for some time. After all, if the transcript that is now non-coding (or produces only a crippled protein) doesn't harm the cell, there is no pressure to shut off transcription. Also, you can check whether the RNA-seq data supports different splice sites than the ones indicated by TOGA. If so, the splice sites may be intact and the gene may not be lost, as you would then have only a frameshift left in exon 5 (which may also be a splice site mutation). Transcriptome-based validation is only necessary for genes that have few mutations or mostly splice site mutations.
|
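The comparison of repeated RELAX runs discussed above can be tallied with a small script. This is a minimal sketch, assuming each run has been reduced by hand to a (K, p-value) pair; `tally_relax_runs` is a hypothetical helper and the values below are made-up placeholders, not results from this thread. In RELAX, K < 1 indicates relaxation and K > 1 indicates intensification.

```python
from collections import Counter


def tally_relax_runs(runs, alpha=0.05):
    """Count relaxation / intensification calls across repeated RELAX runs.

    runs: iterable of (k, p_value) pairs, one per run (hypothetical input format).
    A run counts as significant when p_value <= alpha.
    """
    counts = Counter()
    for k, p_value in runs:
        if p_value > alpha:
            counts["not significant"] += 1
        elif k < 1:
            counts["relaxation"] += 1
        else:
            # K >= 1 is treated as intensification in this sketch
            counts["intensification"] += 1
    return counts


# Placeholder values for illustration only
example_runs = [(0.4, 0.0001), (0.5, 0.0002), (1.8, 0.037), (0.3, 0.0)]
summary = tally_relax_runs(example_runs)
print(dict(summary))
```

Keeping the K value next to each p-value makes it easier to see whether runs that disagree in direction also differ much in magnitude.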
Thank you very much Dr. @MichaelHiller, this helps a lot! I will check the alignments against the RNA-seq data and look at the exon-intron structure. I will write back to you about this issue if we are unable to resolve it at the sequence level. Again, I appreciate your thoughtful feedback. Best regards |
Greetings Dr. Hiller @MichaelHiller
I wanted to clarify some of my queries regarding L and UL genes that I also asked about in a previous thread (#183).
For UL gene losses, you suggested that I check both the transcriptome and the RELAX test.
I performed the RELAX test (HyPhy) with all in-group species as test branches and all out-group species as reference branches. I provided HyPhy with a codon alignment after removing all sequences that consist entirely of gaps (---).
codon.fasta has sequences whose headers contain the transcript ID and gene name, like this: ENSDART0000004453.fgf10a | CODON | REFERENCE (for the reference) and ENSDART0000004453.fgf10a | CODON | QUERY (for the query). I extracted only the query species, using resources from the TOGA discussion page:
cat codon.fasta | grep "CODON | QUERY" -w -A 1 | grep "^\-\-$" -v | awk '{if ($1 ~ /^>/) printf $1"\t"; else print $0}' | sed 's/-//g' | awk -F "\t" '{if ($2 != "") print $1"\n"$2}' > Sp1Codon.fasta
On my focal branch of the tree, gene X is lost in all in-group species (TOGA status is "L" in every species). This gene has two transcripts, t1 and t2. RELAX reports relaxation of selection for t1 and evidence for intensification of selection for t2. The likelihood ratio test gave p = 0.0000 for t1 and p = 0.0007 for t2, both significant at p <= 0.05.
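For readers who find the shell pipeline hard to follow, the same extraction step can be sketched in Python. This is a minimal sketch, not the TOGA authors' tooling: `read_fasta` and `extract_query_sequences` are hypothetical helper names, and the second query record in the example (geneY) is invented for illustration. It keeps records whose header contains "CODON | QUERY", strips gap characters from the sequence, and drops records that are entirely gaps. Unlike the `sed` step in the pipeline, it leaves hyphens in headers untouched.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def extract_query_sequences(text, tag="CODON | QUERY"):
    """Return (id, ungapped_sequence) for query records that are not all gaps."""
    selected = []
    for header, seq in read_fasta(text):
        ungapped = seq.replace("-", "")
        if tag in header and ungapped:
            # Keep only the first whitespace-delimited token of the header
            selected.append((header.split()[0], ungapped))
    return selected


# Illustrative input; the geneY record is a made-up all-gap example
example = """>ENSDART0000004453.fgf10a | CODON | REFERENCE
ATGAAA---TGA
>ENSDART0000004453.fgf10a | CODON | QUERY
ATG---AAATGA
>ENSDART0000009999.geneY | CODON | QUERY
------------
"""
result = extract_query_sequences(example)
print(result)
```

Note that stripping gaps means the output sequences are no longer aligned to each other, which is also true of the original pipeline.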
It's a very confusing situation for me.
I am also adding a picture of the mutation plot of this gene for both transcripts. I found something strange: in all 11 of my focal species the mutations are the same (exactly identical mutations shared in the exons).
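One way to confirm that the mutations really are identical across species is to intersect the per-species mutation sets. This is a minimal sketch, assuming each mutation has been transcribed by hand into an (exon, position, type) tuple; that format, the species names, and the values are hypothetical placeholders and do not reflect TOGA's output format.

```python
def shared_and_private(mutations_by_species):
    """Split mutations into those shared by all species and those private to each.

    mutations_by_species: dict mapping species name -> iterable of hashable
    mutation descriptors (hypothetical format). Must contain at least one species.
    """
    sets = {species: set(muts) for species, muts in mutations_by_species.items()}
    shared = set.intersection(*sets.values())
    private = {species: muts - shared for species, muts in sets.items()}
    return shared, private


# Placeholder data for illustration only
example = {
    "sp1": [(3, 120, "frameshift"), (5, 310, "stop")],
    "sp2": [(3, 120, "frameshift"), (5, 310, "stop")],
    "sp3": [(3, 120, "frameshift"), (5, 310, "stop"), (7, 45, "splice")],
}
shared, private = shared_and_private(example)
print(sorted(shared))
print({species: sorted(muts) for species, muts in private.items()})
```

If every species ends up with an empty private set, all species carry exactly the same set of mutations.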
I am not 100% sure whether I made a mistake somewhere in this process. Please help me understand this case.
Looking forward to hearing from you
Thank you
Vinita