Some rRNA not mapped to the reference genome #2
Comments
Hello:
|
Hello,
Thank you. |
Just to clarify, Yuchen – are you mapping to the custom genome (Mouse_mm10-rDNA_genome_v1.0.tar.gz) on our Github or are you mapping to the standard mm10 genome that you get from NCBI? Because in our custom genome, chr17:39842543-39849275 is in the list of regions that should have been masked (which means that sequence should just be “NNNNN…” and nothing should map to it). The locus you gave us falls inside that region. Can you check yourself and see whether our Mouse_mm10-rDNA_genome_v1.0.tar.gz genome has the “NNNNN…” sequence there?
Vikram
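One way to run the check Vikram describes is to pull the region out of the genome FASTA and measure how much of it is "N". The snippet below is a minimal sketch, assuming the tarball unpacks to a plain-text FASTA whose headers start with names such as chr17; the helper names `read_fasta` and `n_fraction` are hypothetical and not part of the repository.

```python
from typing import Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from a plain-text FASTA file."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                # Keep only the first word of the header, e.g. "chr17".
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def n_fraction(path: str, chrom: str, start: int, end: int) -> float:
    """Fraction of N bases in chrom:start-end (1-based, inclusive)."""
    for name, seq in read_fasta(path):
        if name == chrom:
            region = seq[start - 1:end].upper()
            if not region:
                raise ValueError("region lies outside the sequence")
            return region.count("N") / len(region)
    raise KeyError(f"{chrom} not found in {path}")
```

Calling `n_fraction(fasta_path, "chr17", 39842543, 39849275)`, where `fasta_path` points at the unpacked genome FASTA (its file name is not given in this thread), should return 1.0 if the whole region is masked.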
From: Xuyuch ***@***.***>
Reply-To: vikramparalkar/rDNA-Mapping-Genomes ***@***.***>
Date: Wednesday, September 25, 2024 at 11:42 AM
To: vikramparalkar/rDNA-Mapping-Genomes ***@***.***>
Cc: "Paralkar, Vikram" ***@***.***>, Comment ***@***.***>
Subject: [External] Re: [vikramparalkar/rDNA-Mapping-Genomes] Some rRNA not mapped to the reference genome (Issue #2)
Hello,
Thanks for your help, and sorry for the late reply. Here is the information I have:
1. The genomic locus of Rn18s-rs5 is chr17:39846354-39848202. More information can be found in the NCBI Gene database under Gene ID 110183.
2. I ran a BLAST comparison between Rn18s-rs5 and chrR. Most of the sequence matches; there are only small mismatched regions (no more than 20 bp) between the Rn18s-rs5 sequence and BK000964.3 (the NCBI rRNA region). According to the paper, you also used BK000964.3 as the base to build chrR.
3. I haven't tried remapping with the mm39 genome, but I did try mapping to BK000964.3, and the same problem occurs.
Thank you.
Yuchen
|
Hmm, that's interesting. I wonder if it has something to do with Bowtie being used first and STAR aligner being used second? We have never tried STAR aligner for this. What if you use Bowtie again for your second mapping to the standard genome? Or use STAR first and Bowtie second? I wonder if it's something to do with STAR being a "better" aligner perhaps?
Also, could you quantify the actual number of reads mapping to the Rn18s-rs5 gene (from the "unmapped reads" file), and what that number is compared to the chrR reads that mapped in the initial run? Are the Rn18s-rs5 reads a small fraction of the chrR reads (like <5%, in which case they likely won't matter), or is it a massive number (like >50%)?
Vikram
…________________________________
From: Xuyuch ***@***.***>
Sent: Wednesday, September 25, 2024 3:46 PM
To: vikramparalkar/rDNA-Mapping-Genomes ***@***.***>
Cc: Paralkar, Vikram ***@***.***>; Comment ***@***.***>
Subject: [External] Re: [vikramparalkar/rDNA-Mapping-Genomes] Some rRNA not mapped to the reference genome (Issue #2)
Hi Vikram.
Thanks for the clarification. I see what you mean and understand that, given the masking you used, those reads should not map to that locus in the custom genome. But that is exactly what worries me. According to RefSeq, it is an 18S rRNA, yet the results show it mapped neither to chrR nor to any autosome.
Here is how I found Rn18s-rs5. In short, I found it in the unmapped reads after mapping to the genome from the GitHub repository.
I used the custom genome from GitHub (Mouse_mm10-rDNA_genome_v1.0.tar.gz) and collected all mapped rRNA reads. I then double-checked the remaining unmapped reads, which I assumed should not contain any rRNA reads. However, among the unmapped reads I found Rn18s-rs5, which should be an rRNA gene.
In other words, I assumed all rRNA reads would align to chrR, but some reads, which map to Rn18s-rs5, did not map to chrR and were left in the unmapped reads.
I hope I have explained my concern clearly enough; please let me know if my reasoning makes sense to you.
|
What was the total number of reads that mapped to the full chrR, and to the 18S and 28S portions of chrR in your initial Bowtie run?
Vikram
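The counts asked for here can be read straight from the SAM file written by the first Bowtie2 run. The function below is a minimal sketch, assuming an uncompressed SAM file and a reference named chrR; the name `count_reads_by_region` is hypothetical, and the 18S and 28S intervals must be supplied by the caller because their coordinates are not given in this thread. Reads are assigned by their leftmost aligned position only, which is an approximation.

```python
from typing import Dict, Tuple


def count_reads_by_region(
    sam_path: str,
    chrom: str,
    regions: Dict[str, Tuple[int, int]],
) -> Tuple[int, Dict[str, int]]:
    """Return (reads mapped to chrom, reads starting inside each named region).

    Region coordinates are 1-based and inclusive. A read is assigned to a
    region by its leftmost aligned position (SAM column 4).
    """
    total = 0
    per_region = {label: 0 for label in regions}
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header line
            fields = line.rstrip("\n").split("\t")
            flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
            # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
            if flag & 0x4 or flag & 0x100 or flag & 0x800:
                continue
            if rname != chrom:
                continue
            total += 1
            for label, (start, end) in regions.items():
                if start <= pos <= end:
                    per_region[label] += 1
    return total, per_region
```

For the run described in this issue, the call would look like `count_reads_by_region("rRNA_trash.sam", "chrR", {"18S": (s18, e18), "28S": (s28, e28)})`, with the four interval bounds taken from whatever annotation of chrR is available.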
…________________________________
From: Xuyuch ***@***.***>
Sent: Wednesday, September 25, 2024 8:45 PM
To: vikramparalkar/rDNA-Mapping-Genomes ***@***.***>
Cc: Paralkar, Vikram ***@***.***>; Comment ***@***.***>
Subject: [External] Re: [vikramparalkar/rDNA-Mapping-Genomes] Some rRNA not mapped to the reference genome (Issue #2)
Hi Vikram:
Yep! I tried running the Bowtie2 alignment twice, and the second run did not give me any mapped reads, so I think STAR may align more reads in some specific circumstances.
I raised this point because Rn18s-rs5 reads seem dominant in my unmapped reads. Of the 2,932,041 unmapped reads I put into STAR (around 30% of total reads), 457,111 are Rn18s-rs5 reads. Because the data come from a ribosome footprinting experiment, most of the reads are rRNA. Rn18s-rs5 reads are not a large fraction of the total counts; however, for a single gene, this count is far larger than that of a typical mRNA (Actb, a typical housekeeping gene, has only 4,166 reads) and comparable to most of our 18S rRNA reads that aligned to chrR in the GitHub genome.
Thank you!
Yuchen
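The proportions implied by these numbers can be worked out directly. The snippet below is a small sketch using only the figures quoted in this message; the function name `read_fractions` is hypothetical, and the total read count is an estimate derived from the "around 30%" statement rather than a reported value.

```python
def read_fractions(
    gene_reads: int,
    unmapped_reads: int,
    unmapped_share_of_total: float,
    reference_gene_reads: int,
) -> dict:
    """Express one gene's read count relative to several denominators."""
    # Total reads are estimated from the stated share, not measured.
    estimated_total = unmapped_reads / unmapped_share_of_total
    return {
        "share_of_unmapped": gene_reads / unmapped_reads,
        "estimated_total_reads": estimated_total,
        "share_of_total": gene_reads / estimated_total,
        "fold_over_reference_gene": gene_reads / reference_gene_reads,
    }


# Figures from the message: Rn18s-rs5 reads, reads passed to STAR,
# their approximate share of all reads, and the Actb read count.
stats = read_fractions(457_111, 2_932_041, 0.30, 4_166)
for key, value in stats.items():
    print(f"{key}: {value:.4g}")
```

On these inputs, Rn18s-rs5 accounts for roughly 16% of the reads passed to STAR, a little under 5% of the estimated total, and about 110 times the Actb count. The comparison Vikram asked for, Rn18s-rs5 reads relative to chrR reads, still needs the chrR count from the first Bowtie2 run, which is not reported in this thread.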
|
Dear team,
Thanks for providing these wonderful reference genomes; they have helped me a lot in my research. I have run into a problem: many of my reads can be mapped to an rRNA region but were classified as unaligned by Bowtie2. Here are the details.
I ran:
bowtie2 -x rRNA_mm10_reference_genome/bowtie2_index/rRNA -U ../clean_Sample_1_R1.fastq.gz -S rRNA_trash.sam --un mrna_sample1.fastq
I assumed the unmapped reads were normal mRNA. As the next step, I passed the unmapped reads to the STAR aligner and obtained the mapped mRNA reads.
Later, I used featureCounts to get the count matrix:
featureCounts -a gencode.vM10.annotation.gtf -g gene_name -o count_sample1_R1_ds sample1_mRNA_R1Aligned.sortedByCoord.out.bam
I found a huge number of reads assigned to the Rn18s-rs5 gene, which is a typical 18S rRNA. But it shows up in the mRNA file, which should not contain any rRNA (all rRNA should have been filtered out). I double-checked those reads in the genome browser (IGV), and the majority of them do not span splice junctions. Could you help me look into this issue? Thanks a lot.
Yuchen
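To see how much rRNA signal survives into the count matrix, the featureCounts output can be scanned for rRNA-like gene names. The function below is a minimal sketch, assuming featureCounts' default tab-separated output for a single BAM (a leading "#" comment line, a header row, and the count in the last column). The name pattern is an assumption chosen to catch names such as Rn18s-rs5 and Rn45s, and `rrna_counts` is a hypothetical helper name.

```python
import csv
import re
from typing import Dict

# Assumed naming pattern for mouse rRNA genes and pseudogenes
# (e.g. Rn18s-rs5, Rn45s); adjust it to match your annotation.
RRNA_NAME = re.compile(r"^Rn(45|28|18)s")


def rrna_counts(featurecounts_path: str, pattern: re.Pattern = RRNA_NAME) -> Dict[str, int]:
    """Return {gene: count} for rRNA-like genes with non-zero counts."""
    hits: Dict[str, int] = {}
    with open(featurecounts_path, newline="") as handle:
        # Drop the '#' comment line that featureCounts writes first.
        data_lines = (line for line in handle if not line.startswith("#"))
        rows = csv.reader(data_lines, delimiter="\t")
        next(rows)  # skip the header row (Geneid, Chr, Start, End, Strand, Length, <bam>)
        for row in rows:
            if not row:
                continue  # ignore blank lines
            gene, count = row[0], int(row[-1])
            if count > 0 and pattern.search(gene):
                hits[gene] = count
    return hits
```

With the command above, `rrna_counts("count_sample1_R1_ds")` would list every matching gene that received reads, which makes it easy to tell whether Rn18s-rs5 is the only rRNA-like gene escaping the first filtering step.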