
Documentation improvement request #217

Open
Eric678 opened this issue Oct 23, 2024 · 6 comments
Labels: documentation, question (Further information is requested)

Comments

Eric678 commented Oct 23, 2024

New user here; great work, sorely needed for Qubes. A couple of issues:
There is no mention of receiving "other" volumes. Question: is the path bundled into the archive, or does every vol receive require a separate wyng command with --saveto? (Guessing yes; this should be documented.) How should the JSON be formatted for this scenario?
The other issue: I feel there should be a paragraph right up front explaining that the wyng archive storage format uses a huge number of inodes. In my case, backup drives are mkfs'd with a relatively small number of inodes, which has never been an issue before. A single 10T offline backup drive's 1.2M inodes are wiped out by a 70GB wyng archive, which is not big, using the default chunk size of 128K. This brings up another question: does wyng send check that there are enough inodes when it checks space in advance? If it runs out, does it fail atomically and roll back the whole session, or just the last partial vol?
Thanks!

tasket added the question and documentation labels on Oct 23, 2024
tasket (Owner) commented Oct 23, 2024

Hi @Eric678

The local path is not saved in the archive; it's specified with the --local command-line parameter, which is required for most commands. It acts like an absolute base path (or path prefix), so you can restore multiple volumes at once, much like you can back up multiple vols at once. --local can also be added as a default setting in /etc/wyng/wyng.ini to avoid typing it repeatedly.
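For illustration only, a minimal sketch of a multi-volume restore under a base --local path; the volume group/pool and volume names are invented placeholders, and the exact option placement and spelling should be checked against the Wyng docs rather than taken from this sketch:

```sh
# Placeholders throughout; "vg00/pool00" and the volume names are not from this thread.
sudo wyng --local=vg00/pool00 receive vm-work-private vm-mail-private
```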

The --saveto option is for special cases when you need to restore to a very specific path and/or there is no suitable snapshot-capable storage set up on the local system. Only one vol may be restored at a time with --saveto.

> the wyng archive storage format uses a huge number of inodes. In my case, backup drives are mkfs'd with a relatively small number of inodes

That does seem like a special case, but it is also worth a mention in the docs. FWIW, I've done some testing with Ext4, XFS and Btrfs, and the only time it seemed to come near to causing inode exhaustion was when the archive chunk size was set to 64KB and the data was especially compressible (hence very small files) and no deduplication was used (dedup has the effect of conserving inodes if there is substantial duplication). The two obvious ways out of this are doing mkfs with the default ratio of inodes or creating a new Wyng archive with a larger chunk size; a third route would be to enhance Wyng with an option that lets you set the maximum archive size.
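As a general ext4 illustration (not Wyng-specific), a sketch of how one might check the inode budget on an existing backup filesystem and create a new one with a denser inode ratio; the device and mount paths are placeholders:

```sh
df -i /mnt/backup                            # inodes total/used/free on the mounted backup fs (placeholder mount point)
sudo tune2fs -l /dev/sdX1 | grep -i inode    # inode count and size of an existing ext4 fs (placeholder device)
sudo mkfs.ext4 -i 16384 /dev/sdX1            # new fs with roughly one inode per 16KB, denser than a "big files" ratio
```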

> does wyng send check that there are enough inodes when it checks space in advance? If it runs out, does it fail atomically and roll back the whole session, or just the last partial vol?

No, it doesn't do this. It could conceivably provide a warning when a limit is being approached, or warn any time an fs doesn't have at least 1 inode per 16KB... but most archive utilities (like tar) that can extract large numbers of small files don't do this.

> If it runs out, does it fail atomically and roll back the whole session, or just the last partial vol?

It will fail atomically per volume, meaning you may have some volumes completed for a given session number while the session doesn't exist for other volumes, including the vol being backed up when the error occurred. There will also be (in that one archive volume) some unassociated data that will be deleted the next time Wyng runs.

Eric678 commented Oct 23, 2024

@tasket thank you for the prompt response.
You did not really answer the first question: with "other" non-LVM volumes, the path+filename is specified after :|: as part of the volume name for send. I was asking about receiving those volumes: does --saveto have to be used with each volume, in separate wyng receive commands, or is the vol name sufficient? (The vol name may not be related to the filename, e.g. for a block device.)
Yes, an offline backup drive is special; in my case it has a few big files on it. Yes, I have dedup = 1 in wyng.ini. Wyng is rather unusual in that the backup image itself uses a huge number of inodes. Since the number of inodes in a filesystem cannot be changed, this won't get fixed here until a new offline backup drive gets rotated in, so it needs to be planned for in advance. I need to tar up the wyng archive to put it offline, and that process is surprisingly slow.

tasket (Owner) commented Oct 24, 2024

@Eric678 Oh, I see. When using --import-other-from.

You don't have to use --saveto with those vols. Receive will save them under the local path (i.e. the LVM pool) using the vol's archival name. This can create problems, since LVM is very restrictive about volume name characters and length; receiving a vol named "hello/there" (which contains a "/") would require either --saveto (using a non-LVM path, or an LVM path with a volume you created manually) or first renaming the archive vol before receiving.
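A hedged sketch of those two routes for a vol whose archive name isn't LVM-friendly; the paths are invented, the rename subcommand name is assumed from the suggestion above, and the --saveto spelling is queried later in this thread:

```sh
# Route 1: skip the LVM pool and write the received vol to an explicit path (placeholder path).
sudo wyng receive --saveto=/mnt/restore/hello_there.img "hello/there"

# Route 2: rename the archive vol to something LVM can accept, then receive normally.
# (Subcommand name assumed; check wyng --help for the actual rename syntax.)
sudo wyng rename "hello/there" hello-there
sudo wyng receive hello-there
```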

> Wyng is rather unusual in that the backup image itself uses a huge number of inodes

FWIW, this is the first time I recall someone reporting inode exhaustion, even though the sparsebundle-like format has been used in Wyng since 2018. Apple has employed this sparsebundle volume strategy in Time Machine to accelerate backups, and that's where I got the idea.

One workaround you could use is to create a disk image on the backup drive (e.g. truncate -s 2000G /mnt/backups/wyngcontainer.img), then format the img normally and either copy the archive into it or create a new archive in it.
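A sketch of that workaround end to end; the size and mount point are examples, and mkfs.ext4 will prompt for confirmation when pointed at a regular file:

```sh
truncate -s 2000G /mnt/backups/wyngcontainer.img    # sparse file, consumes almost no space up front
mkfs.ext4 /mnt/backups/wyngcontainer.img            # default inode ratio applies inside the image
mkdir -p /mnt/wyngarchive
mount -o loop /mnt/backups/wyngcontainer.img /mnt/wyngarchive
# ...then copy the existing archive into /mnt/wyngarchive, or point a new archive at it.
```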

Eric678 commented Oct 25, 2024

@tasket thanks. So if you want to actually restore a backup of a block device, you have to use --saveto; that should be mentioned in the section on receive. I notice the --saveto option also appears as --save-to in the docs: which is correct, or both?
I want to restore the /boot and /boot/efi block devices from a full Qubes dom0 backup, as described here:
https://github.com/tasket/wyng-util-qubes/issues/4#issuecomment-2436280239

Eric678 commented Oct 31, 2024

@tasket should I post any more of what I consider bugs? There have been no comments or questions so far. I have several more, and I have managed to brick my main archive doing things I would consider perfectly normal. That seems odd since you have had this up for 7 years? It is still beta, though, and extremely fragile, so something must be different about my environment. While I have had recent problems with R4.2.3, I do now have a patched-together reliable system that does not show any unexpected software faults except in wyng-backup, so please don't dismiss anything I report as a hardware fault.
Another doc issue: the default chunk-factor (1 = 128K) does not match the Qubes default-install root-pool chunk size (64K) or the vm-pool chunk size (512K on my default 1T partitioning). Perhaps that combination has not been well tested? wyng chunk-factor=4 will much reduce my inode consumption problem. In the case where the wyng chunk size is smaller than the LVM pool chunk size, does wyng just assume all wyng chunks have changed, or does it compare each one (once the metadata says there has been a change) and keep the changes in minimal space? If not, there must be an awful lot of links to the all-zeros chunk, if dedup is on. Perhaps that breaks something.
The only other thing that might be different is that my archive is on a RAID1 mounted on a qube (--dest=qubes://...).
Thanks.

tasket (Owner) commented Nov 13, 2024

The archive chunk size has no relationship to thin-LVM chunk sizes, so there are no compatibility concerns to worry about there. Wyng uses an internal bitmap for each vol to flag which chunks may have changed; it steps through the bitmap at different rates depending on the LVM or fs block size.

There is no 'zero chunk' to link to. Zero status is just a marker within the archive metadata. Incidentally, the more hardlinks are used in deduplication, the fewer inodes are used by the archive.
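As a generic filesystem illustration of that last point (nothing Wyng-specific): hardlinked files share one inode, so chunks deduplicated via hardlinks add directory entries but not inodes:

```sh
echo data > chunk_a
ln chunk_a chunk_b        # second directory entry pointing at the same inode
ls -li chunk_a chunk_b    # identical inode numbers, link count 2
df -i .                   # inode usage grew by one file, not two
```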

And RAID configs should have no effect on compatibility, but of course filesystem configs can. Using fs options that change the write ordering or priority of journals and data can negatively affect any data format; those options were meant for high-redundancy, high-availability systems, and the added throughput may not be worth the cost in data corruption. In my experience, both Ext4 and XFS can lose or destroy data this way (more so XFS).
