Ribodetector unable to remove rRNA reads from Toxoplasma gondii scRNASeq SmartSeq2 data #56

Open
Rohit-Satyam opened this issue Jan 17, 2025 · 3 comments

Comments

@Rohit-Satyam

Hi @alicemchardy @dawnmy

I am trying to use RiboDetector on SmartSeq2 data from a published study. The authors reported an abundance of ribosomal reads, so I thought I would eliminate them using your tool. The data consist of 71 bp paired-end reads.
When I run RiboDetector and then perform alignment, I see that a lot of the unmapped reads are actually 28S ribosomal reads. I began to wonder whether Toxoplasma rRNA is present in the RiboDetector training set. I tried -e rrna and -e norrna, but I don't see much improvement and I still get ribosomal reads among the unmapped reads. I also tried to discuss with Alexdobin why they do not align properly here.

>SRR11062959.436379 0:N:  00
CGGCGAGTGAACCGGGATCAGCTCAAAGTGGAAATCAACTGCTCTTTCCGAGCTGTTGACTTGTAGCCTCG
>SRR11062959.436381 0:N:  01
GGCCTGGCTCGGGCATATTAACCCGATTCCCTTTCGCGAAACGAGGTAAAGACACATAACGTATACGCAGT
>SRR11062959.436382 0:N:  00
ACCGATACCAGGCCATTGCTACTAATTATCACTCGACCTTTTGTAGCAATGAGTAGGAAGACGTGGGGATT
>SRR11062959.436384 0:N:  00
GTACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCACAACAACAAAGAAAAAAAAAACCCAAGTTAATTTTG
>SRR11062959.436385 0:N:  00
GAGTTATTTAAAACAGACGTGTGCTCTTCCGATCTATCGCCTTAATAGTGCTGTGGTCTGATTACGAGTGA
>SRR11062959.436387 0:N:  00
ACGCAGAGTACATGGGGTGCTCTTCCGATCTGGAGCGCGATGGGTGCTGTGGCCTGATTACGAGTGATTGG
>SRR11062959.436392 0:N:  00
ATATGTGCGAGTATGCGGGTTTTACTCCTGTATGCGCAATGAAAGTGAGAGTAGGGAGATTTTGGCTTTGC
>SRR11062959.436394 0:N:  00
GCATGAAGCAGTCCCAAGCTCCGTCAAATACAGGCCACTGGGGCGCAAGGTACCCAGCCCTCAGAGCCAAT
>SRR11062959.436399 0:N:  00
GGATAAAAGAAAAGGCTGTTAAAAAGCAAAGACAACGCTTCCAGGAGCACCTGCCTGCGTCGCGGAGTTCA
>SRR11062959.436401 0:N:  00
CAAATCACAGGAGTGTAATTGAGAATGAGAGAGAGATGCAAACATATTGACTAATAATTTAAATTATTTAA
>SRR11062959.436402 0:N:  00
GCCAGGACGTGGATACTGTACGGTAACGTAAGTGAACTCCTCGACACAGGCAGGTGCTACGGGAAGCGTTG
>SRR11062959.436406 0:N:  00
GTACTTTTTTTTTTTTTTTTTTTTTTTTTTTACCCCCCACAAAAGAAAAAAACCCCCATAAAAGGGGTTTT
>SRR11062959.436408 0:N:  00
ATGTTAGACTCCTTGGTCCGTGTTTCAAGACGGGTCGGTTGGAACCGATTAAGCCAGCATCACAGAAACCG
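
Two of the reads listed above (SRR11062959.436384 and SRR11062959.436406) contain long poly-T runs, so it may help to set low-complexity reads aside before looking at candidate rRNA reads. The following is a minimal Python sketch of that triage, not part of RiboDetector. The function names and the 15-base threshold are illustrative assumptions, and the check says nothing about whether a read is ribosomal.

```python
import re
from typing import Dict, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first token of the header as the read ID.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of one repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def flag_long_homopolymers(text: str, min_run: int = 15) -> Dict[str, int]:
    """Map read ID to longest run for reads whose run is >= min_run.

    min_run=15 is an arbitrary illustrative threshold, not a recommendation.
    """
    flagged = {}
    for name, seq in parse_fasta(text):
        run = longest_homopolymer(seq)
        if run >= min_run:
            flagged[name] = run
    return flagged


# Two reads copied from the post above.
example = """>SRR11062959.436379 0:N:  00
CGGCGAGTGAACCGGGATCAGCTCAAAGTGGAAATCAACTGCTCTTTCCGAGCTGTTGACTTGTAGCCTCG
>SRR11062959.436384 0:N:  00
GTACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTCACAACAACAAAGAAAAAAAAAACCCAAGTTAATTTTG
"""
print(flag_long_homopolymers(example))
```

Reads flagged this way can be inspected separately, which leaves a smaller set of unmapped reads to compare against rRNA references.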
@Rohit-Satyam changed the title from "Ribodetector unable to remove rRNA reads from Toxoplasma gondii RNASeq data" to "Ribodetector unable to remove rRNA reads from Toxoplasma gondii scRNASeq SmartSeq2 data" on Jan 17, 2025
@dawnmy
Member

dawnmy commented Jan 17, 2025

Thank you for bringing this to our attention. We did include 28S rRNA in our training dataset, but it may be underrepresented. Specifically, we used curated SSU and LSU rRNA sequences from the SILVA database for training. I tried to BLAST the reads you provided against the SILVA LSU database, but none could be taxonomically assigned. Therefore, more eukaryotic rRNA sequences from other sources will need to be included in the training of a future model.
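
Until such a model exists, one way to check reads against a custom rRNA reference (for example, a FASTA of apicomplexan rRNA) is exact k-mer matching. The sketch below is a simplified illustration in Python. It is not how RiboDetector classifies reads, and the function names, the toy reference sequence, and k=21 are all assumptions made for the example.

```python
from typing import Iterable, Set

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references: Iterable[str], k: int = 21) -> Set[str]:
    """Collect k-mers from both strands of every reference sequence."""
    index: Set[str] = set()
    for ref in references:
        ref = ref.upper()
        index |= kmers(ref, k)
        index |= kmers(reverse_complement(ref), k)
    return index


def shared_fraction(read: str, index: Set[str], k: int = 21) -> float:
    """Fraction of the read's distinct k-mers that occur in the index."""
    read_kmers = kmers(read.upper(), k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & index) / len(read_kmers)


# Hypothetical toy reference, NOT a real rRNA sequence.
toy_reference = "ACGTTGCAAGCTTAGGCTAACCGTTAGCATCGATCGGATCCTAGGCTTAACGGATTCAGGCTA"
index = build_index([toy_reference], k=21)

matching_read = toy_reference[10:50]    # exact substring of the reference
unrelated_read = "T" * 40               # shares no 21-mer with the reference

print(shared_fraction(matching_read, index))
print(shared_fraction(reverse_complement(matching_read), index))
print(shared_fraction(unrelated_read, index))
```

Exact k-mer matching is strict: a read from a divergent rRNA copy can share few or no 21-mers with the reference, so a low fraction does not prove a read is non-ribosomal.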

@Rohit-Satyam
Author

Rohit-Satyam commented Jan 17, 2025

To make your life easier and ensure we are also helped, here is the link to the database from which you can download all apicomplexan rRNA: https://veupathdb.org/veupathdb/app/search?q=rRNA. Should you need a FASTA file, I can download it and send it to you via email, because the size would be huge (i.e. 171,559).

Best

@dawnmy
Member

dawnmy commented Jan 17, 2025

> To make your life easier and ensure we are also helped, here is the link to the database from which you can download all apicomplexans rRNA: https://veupathdb.org/veupathdb/app/search?q=rRNA. Should you need fasta file, I can download and send it to you via email because the size would be huge (i.e. 171,559).
>
> Best

That would be great! A FASTA file containing all the rRNA would be ideal. Since the file might be too large to email as an attachment, could you share it with me via OneDrive? My email: [email protected]
