Read trimming filtering

Eaton Lab edited this page Sep 27, 2020 · 1 revision

Should we apply trimming/filtering and if so at what thresholds?

My initial thought was that kmer analysis tools like gce expect some amount of sequencing error in the dataset, and that error kmers are not fit to the same curve as the real data, so filtering might not be important. However, I now think this is a bad take, because point errors and adapters introduce very different biases. Point errors mostly produce rare, low-frequency kmers, whereas adapter sequence is identical in every contaminated read and so produces kmers at artificially high frequency. Trimming could be a big deal.

For example, this link demonstrates clearly that kmer spectra can be quite affected by adapters and quality: https://github.com/wltrimbl/cloud-kmers/blob/master/countingkmers.md
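To make the adapter effect concrete, here is a minimal simulation sketch, not taken from the linked page. Everything in it is an assumption chosen for illustration: the random 5 kb "genome", the read length, the 20% contamination rate, k = 11, and the use of the common Illumina adapter prefix `AGATCGGAAGAGC`. It only shows that adapter kmers pile up far above the genomic coverage peak.

```python
import random
from collections import Counter

ADAPTER = "AGATCGGAAGAGC"  # common Illumina adapter prefix (assumed for this toy example)
K = 11                     # kmer size, arbitrary for illustration


def count_kmers(reads, k):
    """Count every k-length substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def simulate_reads(genome, n_reads, read_len, rng, adapter=None, adapter_rate=0.0):
    """Sample error-free reads; a fraction end in the adapter (short inserts)."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(genome) - read_len + 1)
        read = genome[start:start + read_len]
        if adapter and rng.random() < adapter_rate:
            # Replace the 3' end of the read with the adapter sequence.
            read = read[:read_len - len(adapter)] + adapter
        reads.append(read)
    return reads


rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(5000))

clean = count_kmers(simulate_reads(genome, 2000, 50, rng), K)
dirty = count_kmers(
    simulate_reads(genome, 2000, 50, rng, adapter=ADAPTER, adapter_rate=0.2), K
)

adapter_kmer = ADAPTER[:K]
print("adapter kmer count, clean reads:       ", clean[adapter_kmer])
print("adapter kmer count, contaminated reads:", dirty[adapter_kmer])
print("most common kmer in contaminated reads:", dirty.most_common(1)[0])
```

In a real dataset the adapter kmers would show up as a spike at high multiplicity in the spectrum, which is the kind of distortion a genome-size model is not designed to absorb.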

Which program should we use?

Probably cutadapt, since it's Python-based and I've used it before. TODO.
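As a starting point for that TODO, a cutadapt call for paired-end reads might look roughly like the sketch below. The file names, the adapter sequence, and the thresholds (quality 20, minimum length 50) are placeholders, not decisions; the thresholds question above is still open. The helper `build_cutadapt_cmd` is a hypothetical name. It only builds the command line, using cutadapt's `-a`/`-A` (3' adapters for R1/R2), `-q` (quality trimming), `-m` (minimum length), `-j` (cores), and `-o`/`-p` (outputs) options.

```python
import os
import shutil
import subprocess


def build_cutadapt_cmd(r1_in, r2_in, r1_out, r2_out,
                       adapter="AGATCGGAAGAGC",  # placeholder adapter
                       min_quality=20,           # placeholder threshold
                       min_length=50,            # placeholder threshold
                       cores=1):
    """Return the cutadapt argument list for one paired-end sample."""
    return [
        "cutadapt",
        "-a", adapter,           # 3' adapter on read 1
        "-A", adapter,           # 3' adapter on read 2
        "-q", str(min_quality),  # trim low-quality 3' ends
        "-m", str(min_length),   # drop pairs with a read shorter than this
        "-j", str(cores),
        "-o", r1_out,
        "-p", r2_out,
        r1_in, r2_in,
    ]


# Hypothetical file names, for illustration only.
cmd = build_cutadapt_cmd(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz",
    "trimmed_R1.fastq.gz", "trimmed_R2.fastq.gz",
)
print(" ".join(cmd))

# Run only if cutadapt is installed and the input files actually exist.
if shutil.which("cutadapt") and os.path.exists(cmd[-2]) and os.path.exists(cmd[-1]):
    subprocess.run(cmd, check=True)
```

Comparing kmer spectra before and after a run like this would be one way to decide whether the chosen thresholds matter for the downstream estimates.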