Release v0.0.4 · ksahlin/ultra

Fixed issue #4
Added an option, --use_NAM_seeds, which switches the seeding from MEMs to NAMs (computed with strobemers). NAM seeding makes uLTRA faster and produces smaller intermediate files. Memory usage with --use_NAM_seeds is "fixed" regardless of the number of cores/threads (about 80-90 GB for the human genome), whereas memory usage with the default option grows with the number of cores. As a result, --use_NAM_seeds gives lower peak memory usage than the default when using more than 18 cores, and higher peak memory usage otherwise. Alignment accuracy is largely the same: NAM seeds decrease accuracy by about 0.01%-0.05% compared to MEMs (i.e., 1 alignment in every 2,000-10,000). Because of the faster runtime and smaller disk usage, at the cost of high memory usage, I recommend --use_NAM_seeds for large datasets (>5M reads) when running on nodes with >90 GB of memory and more than 20 cores.
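The recommendation above can be written as a small decision helper, for example in a wrapper script that decides whether to pass the flag. This is only a sketch: the function names `recommend_nam_seeds` and `seeding_args` are hypothetical and not part of uLTRA, and the thresholds are the rough guidance from this release note rather than hard limits.

```python
def recommend_nam_seeds(num_reads: int, mem_gb: float, cores: int) -> bool:
    """Return True if the guidance in this release note suggests --use_NAM_seeds.

    Hypothetical helper, not part of uLTRA. Thresholds follow the note:
    large dataset (>5M reads), node with >90 GB memory, more than 20 cores.
    """
    large_dataset = num_reads > 5_000_000
    enough_memory = mem_gb > 90   # NAM seeding needs about 80-90 GB for human
    many_cores = cores > 20       # default memory usage grows with core count
    return large_dataset and enough_memory and many_cores


def seeding_args(num_reads: int, mem_gb: float, cores: int) -> list[str]:
    """Extra command-line arguments to append to a uLTRA call (assumed wrapper usage)."""
    if recommend_nam_seeds(num_reads, mem_gb, cores):
        return ["--use_NAM_seeds"]
    return []  # keep the default MEM seeding


if __name__ == "__main__":
    # 10M reads on a 128 GB node with 32 cores: NAM seeding is suggested.
    print(seeding_args(10_000_000, 128, 32))
    # 1M reads on a 64 GB node with 8 cores: stay with the default.
    print(seeding_args(1_000_000, 64, 8))
```

Smaller datasets or nodes with less memory fall back to the default MEM seeding, which the note says uses less peak memory at low core counts.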
