libsquashfs: support lowlevel scanning/mapping #75

Closed
AgentD opened this issue Nov 13, 2020 · 0 comments

AgentD commented Nov 13, 2020

Setting some short-term goalposts here:

Several planned features require something like a map of a SquashFS image. That is, using either a linear scan or a recursive directory walk, extract the following from an image:

  • A linked list of all data and fragment blocks. For fragment blocks, this includes the layout of the fragments within them. In both cases, it includes a reference count for blocks that are referenced by multiple files due to deduplication.
  • A linked list of all metadata blocks with their type and contents.
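
As a rough illustration of the first list, here is a minimal Python sketch of such a map, built from a directory-walk style listing. Everything in it is an assumption made for illustration: `BlockEntry`, `build_block_map` and the shape of the `files` input are hypothetical and are not part of the libsquashfs API, and a sorted Python list stands in for the linked list.

```python
from dataclasses import dataclass, field


@dataclass
class BlockEntry:
    """One on-disk data or fragment block (hypothetical layout)."""
    offset: int                 # byte offset of the block inside the image
    size: int                   # on-disk size of the block in bytes
    is_fragment: bool = False   # True for fragment blocks
    refcount: int = 0           # number of file references to this block
    # Fragment blocks only: (offset_in_block, length) -> reference count
    fragments: dict = field(default_factory=dict)


def build_block_map(files):
    """Collect all referenced blocks, ordered by their position on disk.

    `files` is an iterable of dicts with two hypothetical, optional keys:
      'blocks':   list of (offset, size) tuples for full data blocks
      'fragment': (block_offset, block_size, offset_in_block, length)
    """
    entries = {}
    for f in files:
        for offset, size in f.get("blocks", []):
            entry = entries.setdefault(offset, BlockEntry(offset, size))
            entry.refcount += 1  # a deduplicated block is counted once per use

        frag = f.get("fragment")
        if frag is not None:
            blk_off, blk_size, inner_off, length = frag
            entry = entries.setdefault(
                blk_off, BlockEntry(blk_off, blk_size, is_fragment=True))
            entry.refcount += 1
            key = (inner_off, length)
            entry.fragments[key] = entry.fragments.get(key, 0) + 1

    # Sorting by offset makes the result mirror the physical layout.
    return [entries[off] for off in sorted(entries)]
```

Two files that list the same `(offset, size)` pair end up as one entry with `refcount == 2`, which is the information the deduplication-aware features below would need. A linear scan could fill the same structure, except that reference counts would only become known once the inode table has been read.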

What would this help with:

  • For gensquashfs, this helps implement the incremental update feature (see Binary patching tools #54). Based on the new directory tree, the reference counts can be adjusted for deleted files, existing files compared, and the remaining blocks simply copied over (or, when updating in place, blocks with refcount==0 removed, new ones added, and the inode table rebuilt).
  • Instead of the simple, high-level approach, sqfsdiff could create an equivalence relation between inodes and compare re-used data blocks only once. This would improve performance and also help find files that were renamed, instead of listing them as deleted and re-added.
  • The planned binary patching (also Binary patching tools #54) would require sqfsdiff to do exactly that anyway, but combined with something like the Wagner-Fischer algorithm to work out edit transcripts for the remaining, non-equal blocks.
  • For forensic tools (see Forensic Tools #55), it would be cool to have a way to dump/visualize the exact physical layout of a SquashFS image. Also, SquashFS offers plenty of opportunities to hide data. A stubborn scan of the entire image could uncover anything that "doesn't belong there".
  • The rdsquashfs program could unpack files much faster by doing a linear walk through the data, throwing the blocks through the sqfs_block_processor to decompress them in parallel, then working backwards to determine which files the blocks belong to and writing them to disk (several times where a block is shared), or even better, creating COW reflinks to duplicated files instead (see rdsquashfs feature suggestion: hardlink duplicate files on extract #73).
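
To make the binary patching idea more concrete, here is a small Python sketch of the Wagner-Fischer dynamic program with a backtrace that yields an edit transcript. It assumes each block is represented by some hashable identifier such as a checksum, which is an assumption of this sketch and not something settled above. The function name `edit_script` and the operation tuples are hypothetical.

```python
def edit_script(old, new):
    """Return (distance, ops) describing how to turn `old` into `new`.

    `old` and `new` are sequences of comparable items, e.g. block checksums.
    Each op is ('keep', x), ('replace', x, y), ('delete', x) or ('insert', y).
    """
    n, m = len(old), len(new)

    # dist[i][j] = edit distance between old[:i] and new[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i          # delete everything
    for j in range(1, m + 1):
        dist[0][j] = j          # insert everything
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if old[i - 1] == new[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,          # delete
                             dist[i][j - 1] + 1,          # insert
                             dist[i - 1][j - 1] + cost)   # keep or replace

    # Walk back from the bottom-right corner to recover one optimal script.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and old[i - 1] == new[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            ops.append(("keep", old[i - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("replace", old[i - 1], new[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", old[i - 1]))
            i -= 1
        else:
            ops.append(("insert", new[j - 1]))
            j -= 1
    ops.reverse()
    return dist[n][m], ops
```

The table needs O(n·m) time and memory, so in practice it would likely only be run on the blocks left over after the equal ones have been matched through the map, as described in the sqfsdiff item.
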
@AgentD AgentD closed this as completed Jan 30, 2024