Simple Error Model #5
Comments
> A few unknowns: Should the length of the sequences vary?

Yes, they should vary, at least to a certain degree. Read lengths vary in all sequencing technologies, and they are often not normally distributed.

> Which base should we switch to if we get an erroneous call? A random other nucleotide? I would guess a random nucleotide is good enough for a simple error model.

Random should be good enough. Possibly also include indels?
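The "random other nucleotide" idea discussed above could look roughly like the following. This is a minimal sketch, not the project's implementation: the function name `introduce_substitutions`, the `error_rate` parameter and the uniform per-base error probability are all assumptions made for illustration.

```python
import random

NUCLEOTIDES = "ACGT"


def introduce_substitutions(read, error_rate, rng=None):
    """Return a copy of `read` with random substitutions.

    Hypothetical helper: each A/C/G/T base is replaced, with probability
    `error_rate`, by one of the three *other* nucleotides chosen uniformly.
    Any other character (for example N) is left untouched.
    """
    if rng is None:
        rng = random.Random()
    bases = []
    for base in read:
        if base in NUCLEOTIDES and rng.random() < error_rate:
            # Exclude the original base so an error always changes the call.
            choices = [n for n in NUCLEOTIDES if n != base]
            bases.append(rng.choice(choices))
        else:
            bases.append(base)
    return "".join(bases)


if __name__ == "__main__":
    # Seeded generator so repeated runs give the same mutated read.
    print(introduce_substitutions("ACGTACGTACGT", 0.1, random.Random(42)))
```

Passing a seeded `random.Random` keeps the simulated reads reproducible, which is handy when testing the model.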
@Ackia the last HiSeq reads I received are all 76 bp. Also, the BEAR article states that "Illumina reads are generally uniform in length, reads from other technologies can vary greatly in length", which makes sense since X cycles should give you X base pairs. Indels occur at a really low rate in Illumina data: 2.8 × 10^−6 errors per base for R1 insertions and 5.1 × 10^−6 errors per base for R1 deletions, according to doi.org/10.1186/s12859-016-0976-y
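If indels were included after all, the per-base rates quoted above could be plugged in roughly like this. Again a sketch under assumptions: `introduce_indels` and its parameters are hypothetical names, inserted bases are drawn uniformly from A/C/G/T, and only the two R1 rates come from the cited paper.

```python
import random

# R1 rates (errors per base) quoted from doi.org/10.1186/s12859-016-0976-y
INSERTION_RATE = 2.8e-6
DELETION_RATE = 5.1e-6


def introduce_indels(read, ins_rate=INSERTION_RATE, del_rate=DELETION_RATE,
                     rng=None):
    """Return a copy of `read` with random single-base indels.

    Hypothetical helper: each base is dropped with probability `del_rate`;
    after each kept base, a random nucleotide is inserted with probability
    `ins_rate`. Note that the output length can differ from the input.
    """
    if rng is None:
        rng = random.Random()
    out = []
    for base in read:
        if rng.random() < del_rate:
            continue  # deletion: skip this base
        out.append(base)
        if rng.random() < ins_rate:
            out.append(rng.choice("ACGT"))  # insertion after the kept base
    return "".join(out)


if __name__ == "__main__":
    # Rates are exaggerated here so the effect is visible on a short read.
    print(introduce_indels("ACGTACGTACGT", 0.1, 0.1, random.Random(7)))
```

Because indels change the read length, a simulator that wants fixed-length Illumina-style reads would need to trim or extend the result afterwards; that step is left out here.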
I agree. I was mixing Illumina up with IonTorrent. My bad. Good progress!
Closed with 031454b! 🚀
Issue to track the progress on the Roadmap item "Add a simple error model"
I guess that the simplest would be to:
A few unknowns:
* I haven't added a standard deviation parameter. It is hardcoded to 0.01, but that can be discussed.