libsquashfs: support lowlevel scanning/mapping #75

AgentD · 2020-11-13T12:44:29Z

Setting some short-term goalposts here:

Several planned features require something like a map of a SquashFS image, i.e. a way to extract the following from an image, using either a linear scan or a recursive directory walk:

A linked list of all data & fragment blocks. For fragment blocks, this includes the layout of the fragments within the block. In both cases, it includes a reference count for blocks that are referenced by multiple files due to deduplication.
A linked list of all metadata blocks with their type and contents.
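
To make the shape of such a map more concrete, here is a minimal Python sketch of the data & fragment block list with reference counts. It is not libsquashfs API: `BlockRecord`, `build_block_map` and the input format (file name mapped to a list of `(kind, offset, size, frag_span)` tuples) are hypothetical names made up for illustration, and a plain sorted list stands in for the linked list.

```python
from dataclasses import dataclass, field


@dataclass
class BlockRecord:
    """One data or fragment block found in the image (hypothetical layout)."""
    offset: int        # byte offset of the block inside the image
    size: int          # on-disk size of the block
    kind: str          # "data" or "fragment"
    refcount: int = 0  # how many file references point at this block
    # Fragment blocks only: (offset_in_block, length) -> reference count
    fragments: dict = field(default_factory=dict)


def build_block_map(files):
    """Collect every referenced block once and count its references.

    `files` is assumed to map a file name to a list of
    (kind, offset, size, frag_span) tuples, where frag_span is None for
    data blocks and (offset_in_block, length) for fragments.
    """
    blocks = {}
    for refs in files.values():
        for kind, offset, size, frag_span in refs:
            rec = blocks.setdefault(offset, BlockRecord(offset, size, kind))
            rec.refcount += 1
            if frag_span is not None:
                rec.fragments[frag_span] = rec.fragments.get(frag_span, 0) + 1
    # Sort by offset so the result mirrors the physical layout.
    return sorted(blocks.values(), key=lambda r: r.offset)


# Made-up example: a.txt and b.txt share one deduplicated data block and
# store their tail ends in the same fragment block.
files = {
    "a.txt": [("data", 96, 4000, None), ("fragment", 9000, 700, (0, 100))],
    "b.txt": [("data", 96, 4000, None), ("fragment", 9000, 700, (100, 50))],
    "c.txt": [("data", 4096, 3000, None)],
}
block_map = build_block_map(files)
for rec in block_map:
    print(rec.kind, rec.offset, rec.refcount, rec.fragments)
```

A block whose `refcount` drops to 0 after files are removed from the tree would be a candidate for removal, which is the bookkeeping the incremental update idea relies on.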

What would this help with:

For gensquashfs this helps implement the incremental update feature (see Binary patching tools #54). From the new directory tree, the reference counts can be adjusted for deleted files, existing files can be compared, and the remaining blocks simply copied over (or, when updating in place, blocks with refcount==0 removed, new ones added and the inode table rebuilt).
Instead of the simple, high-level approach, sqfsdiff could create an equivalence relation between inodes and compare re-used data blocks only once, improving performance and also helping to find files that were renamed instead of listing them as deleted and re-added.
The planned binary patching (also Binary patching tools #54) would require sqfsdiff to do exactly that anyway, but combine it with something like the Wagner-Fischer algorithm to work out edit scripts to match the remaining, non-equal blocks.
For forensic tools (see Forensic Tools #55) it would be cool to have a way to dump/visualize the exact physical layout of a SquashFS image. Also, SquashFS has plenty of opportunities to hide data. A stubborn scan of the entire image could uncover anything that "doesn't belong there".
The rdsquashfs program could unpack files much faster by simply doing a linear walk through the data, throwing the blocks through the sqfs_block_processor to decompress them in parallel, and then working out backwards which files the blocks belong to. Blocks shared by several files would be written to disk once per file, or, even better, COW reflinks could be created for duplicated files instead (see rdsquashfs feature suggestion: hardlink duplicate files on extract #73).
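
As an illustration of the Wagner-Fischer idea mentioned for binary patching, here is a minimal Python sketch that computes an edit script between two sequences of block identifiers. It assumes each block has already been reduced to a comparable token (for example a checksum); the function name `edit_script` and the operation names are hypothetical and not part of sqfsdiff.

```python
def edit_script(old, new):
    """Wagner-Fischer dynamic programming over two block sequences.

    Returns (distance, ops), where ops is a list of
    (operation, old_block, new_block) tuples.
    """
    n, m = len(old), len(new)
    # dist[i][j] = edit distance between old[:i] and new[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete everything from old
    for j in range(1, m + 1):
        dist[0][j] = j  # insert everything from new
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if old[i - 1] == new[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete old[i-1]
                dist[i][j - 1] + 1,         # insert new[j-1]
                dist[i - 1][j - 1] + cost,  # keep or replace
            )

    # Walk back from the bottom-right corner to recover one optimal script.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if old[i - 1] == new[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                name = "keep" if cost == 0 else "replace"
                ops.append((name, old[i - 1], new[j - 1]))
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", old[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, new[j - 1]))
            j -= 1
    ops.reverse()
    return dist[n][m], ops


# Made-up block identifiers: block "B" was dropped and "E" appended.
old_blocks = ["A", "B", "C", "D"]
new_blocks = ["A", "C", "D", "E"]
distance, ops = edit_script(old_blocks, new_blocks)
print(distance)  # 2
for op in ops:
    print(op)
```

The table needs O(n*m) memory, so in practice this would only be run on the blocks left over after identical ones have been matched through the map.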


AgentD closed this as completed Jan 30, 2024
