Restore performance #53

Open
abbbi opened this issue Aug 21, 2024 · 1 comment

Comments

Collaborator

abbbi commented Aug 21, 2024

hi,

(maybe enable discussions and convert this to a discussion)

See past discussion in this commit:

e872ed2

I did some testing with Amazon S3 today, just to get an impression of how restores perform. My setup is as follows:

  1. Synchronous GBit uplink (Deutsche Telekom), speed test for download:
 time curl http://speedtest.belwue.net/10G -o /dev/null
 real    1m31.900s
  2. Stock AWS bucket, no special settings, Europe (Frankfurt) eu-central-1 (client in Munich)
  3. 10 GB of mixed data.
  4. Dynamic index backup via proxmox-backup-client

Initial backup performance is OK.

root.pxar: had to backup 9.007 GiB of 11.396 GiB (compressed 6.39 GiB) in 73.12 s (average 126.139 MiB/s)

Restore performance:

 time sudo -E proxmox-backup-client restore "host/cefix/2024-08-21T06:03:30Z" root.pxar /home/abi/source/ --repository xx@[email protected]:pmxtest
real    6m12.991s

That is 26.81 MB/s (10 GB in about 373 s).

What is really faster is using the pull mechanism to pull a remote S3 store to a local PBS, using PR #48:

Syncing datastore 'pmxtest', namespace 'Root' into datastore 'test2', namespace 'Root'
found 1 groups to sync (out of 1 total)
sync snapshot host/cefix/2024-08-21T06:02:04Z
sync archive root.pxar.didx
downloaded 6.39 GiB (100.94 MiB/s)
sync archive catalog.pcat1.didx
downloaded 99.3 KiB (1.239 MiB/s)
[..]
TASK OK
real    1m3.299s

The current state is:

  1. Both the PVE restore and the proxmox-backup-client restore seem to request chunks largely sequentially (see: https://bugzilla.proxmox.com/show_bug.cgi?id=3163)
  2. The pull mechanism is implemented asynchronously, so it is a lot faster.
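
To illustrate the difference between the two access patterns, here is a minimal sketch. It is not code from PBS or the proxy: `fetch_chunk` is a hypothetical stand-in that only simulates request latency, and the latency value and concurrency limit are made-up assumptions.

```python
import asyncio
import time

LATENCY = 0.01  # assumed per-request round trip in seconds (made-up value)

async def fetch_chunk(digest: str) -> bytes:
    """Hypothetical stand-in for one S3 GET; it only simulates latency."""
    await asyncio.sleep(LATENCY)
    return digest.encode()

async def restore_sequential(digests):
    # One request at a time: total time is roughly len(digests) * LATENCY.
    return [await fetch_chunk(d) for d in digests]

async def restore_concurrent(digests, limit=8):
    # Up to `limit` requests in flight; gather() keeps results in index order.
    sem = asyncio.Semaphore(limit)

    async def bounded(digest):
        async with sem:
            return await fetch_chunk(digest)

    return await asyncio.gather(*(bounded(d) for d in digests))

def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start

digests = [f"chunk-{i:04d}" for i in range(40)]
seq_result, seq_time = timed(restore_sequential(digests))
con_result, con_time = timed(restore_concurrent(digests))
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Both variants return the chunks in the same order. Only the number of requests in flight differs, which is why the sequential variant is bound by per-request latency rather than by bandwidth.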

Ideas on how to improve this in the proxy:

  1. If a client requests the index, the proxy could parse it before returning it and pre-fetch the required chunks asynchronously in the background.
  2. A local cache for the most-referenced chunks?
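
A rough sketch of how both ideas could fit together, assuming the proxy can get the list of chunk digests from the index. The class `PrefetchingChunkCache`, its methods, and `fake_s3_get` are hypothetical names for illustration only; none of this is the proxy's actual code.

```python
import asyncio
from collections import OrderedDict

class PrefetchingChunkCache:
    """Hypothetical in-proxy chunk cache with background prefetch."""

    def __init__(self, fetch, max_chunks=256, concurrency=8):
        self._fetch = fetch            # async callable: digest -> bytes
        self._cache = OrderedDict()    # digest -> bytes, kept in LRU order
        self._inflight = {}            # digest -> running asyncio.Task
        self._max = max_chunks
        self._sem = asyncio.Semaphore(concurrency)

    async def _load(self, digest):
        try:
            async with self._sem:      # bound the number of parallel GETs
                data = await self._fetch(digest)
            self._cache[digest] = data
            while len(self._cache) > self._max:
                self._cache.popitem(last=False)  # evict least recently used
            return data
        finally:
            self._inflight.pop(digest, None)

    def _start(self, digest):
        # Reuse a running download instead of fetching the same chunk twice.
        task = self._inflight.get(digest)
        if task is None:
            task = asyncio.create_task(self._load(digest))
            self._inflight[digest] = task
        return task

    def on_index_requested(self, digests):
        """Call before returning the index to the client."""
        for digest in dict.fromkeys(digests):    # de-duplicate, keep order
            if digest not in self._cache:
                self._start(digest)

    async def get_chunk(self, digest):
        if digest in self._cache:
            self._cache.move_to_end(digest)      # mark as recently used
            return self._cache[digest]
        return await self._start(digest)

async def demo():
    calls = []

    async def fake_s3_get(digest):               # stand-in for a real S3 GET
        calls.append(digest)
        await asyncio.sleep(0.001)
        return b"data:" + digest.encode()

    cache = PrefetchingChunkCache(fake_s3_get)
    index = ["a", "b", "a", "c"]                 # digests as listed in an index
    cache.on_index_requested(index)              # prefetch starts here
    restored = [await cache.get_chunk(d) for d in index]
    return restored, calls

restored, calls = asyncio.run(demo())
print(len(restored), "chunks served from", len(calls), "downloads")
```

This sketch prefetches the whole index at once. For a large archive that would evict chunks before the client asks for them, so a real version would probably limit the look-ahead to a window ahead of the client's current position.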
Owner

tizbac commented Aug 25, 2024

Ideally, IMHO, we would build a map of which chunk follows which, along with the top N most-used chunks,
and always fetch the next chunk in the background while the current one is being used.

This will use some RAM, but it is 2024 and RAM is cheaper than time, and it can still be made optional anyway.
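
As a sketch of what such a map could look like, assuming the chunk digests are available in the order the index references them: the function `build_prefetch_hints` and the sample digests are hypothetical and only illustrate the idea.

```python
from collections import Counter, defaultdict

def build_prefetch_hints(digests, top_n=2):
    """digests: chunk digests in the order the index references them.

    Returns (next_chunk, hot): the most frequent successor of each chunk
    and the top_n most-referenced chunks.
    """
    usage = Counter(digests)
    successors = defaultdict(Counter)
    for current, following in zip(digests, digests[1:]):
        if current != following:
            successors[current][following] += 1
    # For every chunk, remember the chunk that most often comes right after it.
    next_chunk = {d: c.most_common(1)[0][0] for d, c in successors.items()}
    hot = [d for d, _ in usage.most_common(top_n)]
    return next_chunk, hot

index = ["a", "b", "c", "a", "b", "d", "a", "b"]
next_chunk, hot = build_prefetch_hints(index)
print(next_chunk, hot)
```

While the proxy serves chunk `d`, it could fetch `next_chunk[d]` in the background, and the chunks in `hot` are the candidates to pin in RAM. Memory use grows with the number of distinct chunks, which fits making this optional.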
