The documentation states: "Bases are encoded to the .fxb file by first deleting all N’s, and then packing 3 or 4 bases per byte using a variable length code. The N’s can be restored because they always have a quality score of 0, and no other bases do."
This does not hold true for our data. Near as I can tell, N bases always have a quality score of 2 ("#"). Unfortunately, other bases also sometimes have a quality score of 2. No observed bases have a quality below 2.
As-is, the error only becomes evident on decompression:
fastqz error: unexpected end of .fxb
The N bases are left out of the .fxb file entirely and are never restored, so all subsequent bases shift forward to fill the gaps (including those in subsequent reads).
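To make the failure mode concrete, here is a toy model of the scheme as the documentation describes it. It is not fastqz's actual code: it skips the byte packing, and the names `encode_reads` and `decode_reads` are made up for illustration. It only assumes that Ns are dropped on encode and re-inserted on decode wherever the quality is 0 ("!").

```python
def encode_reads(seqs):
    """Drop every N and concatenate the remaining bases into one stream."""
    return "".join(seq.replace("N", "") for seq in seqs)


def decode_reads(stored, quals_per_read):
    """Rebuild reads, inserting N wherever the quality is '!' (Phred 0)."""
    stream = iter(stored)
    reads = []
    for quals in quals_per_read:
        bases = []
        for q in quals:
            if q == "!":
                bases.append("N")
            else:
                base = next(stream, None)
                if base is None:
                    # Analogous to "unexpected end of .fxb"
                    raise EOFError("unexpected end of base stream")
                bases.append(base)
        reads.append("".join(bases))
    return reads


seqs = ["ACNGT", "GGCA"]
stored = encode_reads(seqs)

# N has quality 0: the round trip works.
ok = decode_reads(stored, ["II!II", "IIII"])

# N has quality 2 ('#'), as in our data: the decoder pulls a real base for
# the N position, every later base shifts, and the stream runs out early.
try:
    decode_reads(stored, ["II#II", "IIII"])
    failed = False
except EOFError:
    failed = True
```

In the second case the first read would come back as "ACGTG", borrowing a base from the next read, which matches the shifting described above.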
Possible fixes, in order of increasing estimated difficulty:
1. Pre- and post-process our data outside of fastqz: convert N qualities to 0 ("!") before compression, and convert 0s back to 2 ("#") after decompression.
2. Bundle the above into fastqz, possibly with a customizable "offset" value.
3. Change the encoding schema to store N values.
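A rough sketch of the first option, working on one read's sequence and quality strings at a time. The function names are hypothetical, and it assumes Phred+33 qualities, that every N has quality 2 ("#"), and that no base in the input has quality 0 (both true of the data described above). It refuses to continue if either assumption is violated, since the round trip would no longer be lossless.

```python
N_QUAL_OBSERVED = "#"  # Phred 2, what our data uses for N
N_QUAL_EXPECTED = "!"  # Phred 0, what the documented encoding expects for N


def preprocess_quality(seq, qual):
    """Before compression: rewrite the quality of every N from '#' to '!'."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if N_QUAL_EXPECTED in qual:
        raise ValueError("quality 0 already present; cannot round-trip")
    out = []
    for base, q in zip(seq, qual):
        if base == "N":
            if q != N_QUAL_OBSERVED:
                raise ValueError("N base with unexpected quality %r" % q)
            out.append(N_QUAL_EXPECTED)
        else:
            out.append(q)
    return "".join(out)


def postprocess_quality(qual):
    """After decompression: map every '!' back to '#'."""
    return qual.replace(N_QUAL_EXPECTED, N_QUAL_OBSERVED)


seq, qual = "ACNGT", "IH#I#"
packed = preprocess_quality(seq, qual)
restored = postprocess_quality(packed)
```

Note that the non-N base with quality "#" at the end of the read is left alone, so only N positions carry quality 0 into the compressor.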
The encoding description quoted at the top is from the documentation at https://docs.google.com/document/pub?id=1f-8C-ZfCUTEsO-EqvlcTXQ0M5aYM61Aet902dA8QZZk