Hi Sebastian, Agnieszka, Heng,
AGC looks great. I wanted to see if it would also work on badly-assembled sequences, e.g. unitigs, and didn't get good compression ratios. Would you say the approach fundamentally wouldn't work for unitigs, or did I miss some parameter tweaks?
I tried to compress the unitigs of 2 human samples (NA06986 & NA06991) using CHM13v2 as reference. The resulting AGC file is 3.6 GB, which is more than the concatenation of the raw gzipped unitigs (2 x 1.7 GB). Command line:
\time ~/tools/agc/agc create -t 10 chm13v2.0.oneline.fa NA06986.unitigs.fa.gz NA06991.unitigs.fa.gz > NA06986_NA06991.agc
Testing with the parameter -s 200 didn't substantially change the results.
Thanks in advance for any feedback,
Rayan
Hi Rayan,
AGC was designed for high-quality assemblies. Nevertheless, I'm a bit surprised that you report such poor ratios, so we have to take a look at this case. We should definitely be better than gzip. :-) I'll let you know when we have any news.
Best,
Sebastian
AGC does look great! And perhaps I misunderstood, but I think the size difference is due to the AGC file including three genomes (i.e., ref + 2 unitig assemblies), not just two. So AGC would still effectively be smaller at 3.6GB than concatenating the three assemblies.
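To make that comparison concrete, here is a minimal Python sketch of the arithmetic. The 3.6 GB archive size and the 1.7 GB per-sample sizes come from the report above. The gzipped size of the reference is not given in this thread, so `REF_GZ_GB` is a hypothetical placeholder to be replaced with the measured value, and `archive_vs_inputs` is just an illustrative helper name.

```python
def archive_vs_inputs(archive_gb, gzipped_inputs_gb):
    """Return (baseline, ratio): the summed gzipped input size and
    the archive size divided by it. A ratio below 1 means the
    archive is smaller than the concatenated gzipped inputs."""
    baseline = sum(gzipped_inputs_gb)
    return baseline, archive_gb / baseline


AGC_GB = 3.6      # archive size reported in the thread
SAMPLE_GZ_GB = 1.7  # per-sample gzipped unitig size reported in the thread
REF_GZ_GB = 0.9   # ASSUMPTION: placeholder, not a figure from the thread

# Comparison as in the original report: the two samples only.
samples_only = archive_vs_inputs(AGC_GB, [SAMPLE_GZ_GB, SAMPLE_GZ_GB])

# Comparison that also counts the reference stored in the archive.
with_reference = archive_vs_inputs(
    AGC_GB, [SAMPLE_GZ_GB, SAMPLE_GZ_GB, REF_GZ_GB]
)

print(f"samples only:   baseline {samples_only[0]:.1f} GB, ratio {samples_only[1]:.2f}")
print(f"with reference: baseline {with_reference[0]:.1f} GB, ratio {with_reference[1]:.2f}")
```

Whether the second ratio actually falls below 1 depends on the real gzipped size of the reference, which would need to be measured on the file used.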