Addresses #173 by using an archive file stream instead of random access when reading from tar archives.
Due to the way this is implemented now, we won't be using parallel processes when reading from tar archives (i.e. the `parallel` parameter is ignored). We could create chunks of files that are adjacent in the archive and split the chunks across multiple processes. However, that in turn would cause issues with the progress bar.

Ultimately, the file streaming seems to be very performant (possibly because we're not having to open/close individual files?), so I'm not too worried about performance. On my machine I can read the tar archive with 97k hemibrain skeletons in around 3 minutes, which isn't too shabby.
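For readers unfamiliar with the difference, here is a minimal sketch of sequential (streamed) reading with the standard-library `tarfile` module. It is not the code in this PR: the function name `iter_tar_text` and its parameters are made up for illustration, and it only shows the general technique of walking the archive once, front to back, instead of seeking to each member.

```python
import io
import tarfile


def iter_tar_text(path=None, fileobj=None, suffix=".swc"):
    """Yield (member_name, text) for matching files, in archive order.

    Hypothetical helper for illustration only; not part of the library.
    """
    # Mode "r|*" opens the archive as a forward-only stream with
    # transparent compression detection; no random access is performed.
    with tarfile.open(name=path, fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            # Skip directories, links and files with other extensions.
            if not member.isfile() or not member.name.endswith(suffix):
                continue
            handle = tar.extractfile(member)
            if handle is None:
                continue
            # In stream mode the member must be read before advancing
            # to the next one, so we read it fully right here.
            yield member.name, handle.read().decode("utf-8")
```

Because each member has to be consumed before the iterator moves on, this pattern is inherently serial, which is consistent with the `parallel` parameter being ignored for tar archives.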
In addition to the above, this PR contains:
- `read_swc`: more robust against an unexpected number of columns
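To illustrate what tolerating an unexpected number of columns could look like, here is a rough sketch. It is an assumption, not the actual `read_swc` change: the helper `parse_swc_rows`, the constant `SWC_COLUMNS`, and the policy (drop surplus trailing columns, skip rows that are too short) are all hypothetical. The only thing taken as given is the conventional seven-column SWC layout.

```python
# Conventional SWC column order; names are chosen here for illustration.
SWC_COLUMNS = ("node_id", "label", "x", "y", "z", "radius", "parent_id")


def parse_swc_rows(text):
    """Parse SWC text into tuples, tolerating odd column counts.

    Hypothetical helper: surplus columns are dropped and rows with
    too few columns are skipped rather than raising an error.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Ignore blank lines and "#" comment/header lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < len(SWC_COLUMNS):
            continue  # assumed policy: skip incomplete rows
        # Assumed policy: keep only the first seven columns.
        node_id, label, x, y, z, radius, parent_id = fields[: len(SWC_COLUMNS)]
        rows.append(
            (int(node_id), int(label), float(x), float(y), float(z),
             float(radius), int(parent_id))
        )
    return rows
```

Whether short rows should be skipped, padded, or reported as errors is a design choice; the sketch skips them only to keep the example short.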