> Adam Kalisz <adam.kal...@notnullmakers.com> wrote on 24.06.2025 12:22 CEST:
>
> Hi Fabian,
CCing the list again, assuming it got dropped by accident.

> the CPU usage is higher, I see about 400% for the restore process. I
> didn't investigate the original much because it's unbearably slow.
>
> Yes, having configurable CONCURRENT_REQUESTS and max_blocking_threads
> would be great. However, we would need to wire it up all the way to
> qmrestore or similar, or ensure it is read from some env vars. I didn't
> feel confident introducing this kind of infrastructure as a first-time
> contribution.

we can guide you if you want, but it's also possible to follow up on our
end with that as part of applying the change.
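to sketch what that wiring could look like on the library side - this is
only a rough, untested illustration; the env variable name, bounds and
default are placeholders, not existing infrastructure:

    use std::env;
    use std::thread;

    /// The 12-way compromise from the benchmarks quoted below.
    const DEFAULT_CONCURRENT_REQUESTS: usize = 12;

    /// Pick the number of concurrent chunk fetches: a hypothetical
    /// PBS_RESTORE_CONCURRENCY env variable wins if set and valid,
    /// otherwise derive the value from the host CPU count, clamped so
    /// that small machines are not overloaded and big ones do not open
    /// an excessive number of requests.
    fn restore_concurrency() -> usize {
        if let Some(n) = env::var("PBS_RESTORE_CONCURRENCY")
            .ok()
            .and_then(|v| v.parse::<usize>().ok())
        {
            return n.clamp(1, 64);
        }
        thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(DEFAULT_CONCURRENT_REQUESTS)
            .clamp(4, 16)
    }

qmrestore (or whatever ends up driving the restore) could then simply set
that variable in the environment, which avoids threading a new parameter
through the whole API stack.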
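for list readers who have not looked at the linked commit: the general
shape of the N-way fetching under discussion is roughly the following -
again just a sketch, fetch_chunk/write_chunk are stand-ins rather than
the real proxmox-backup-qemu API, and the futures and anyhow crates are
assumed:

    use futures::stream::{self, StreamExt};

    // Stand-ins for the real fetch and write paths - purely illustrative.
    async fn fetch_chunk(digest: [u8; 32]) -> anyhow::Result<Vec<u8>> {
        let _ = digest;
        Ok(Vec::new())
    }

    async fn write_chunk(data: Vec<u8>) -> anyhow::Result<()> {
        let _ = data;
        Ok(())
    }

    async fn restore_all(digests: Vec<[u8; 32]>) -> anyhow::Result<()> {
        let mut chunks = stream::iter(digests)
            .map(fetch_chunk) // each digest becomes an in-flight request
            .buffered(restore_concurrency()); // at most N fetches in parallel
        // Results come back in submission order, so a single sequential
        // writer (cf. the single-threaded writer note below) consumes them.
        while let Some(chunk) = chunks.next().await {
            write_chunk(chunk?).await?;
        }
        Ok(())
    }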
> The writer to disk is still single-threaded, so a CPU that can ramp up
> a single core to a high frequency/IPC will usually do better on the
> benchmarks.

I think that limitation is no longer there on the QEMU side nowadays, but
it would likely require some more changes to actually make use of
multiple threads submitting IO.

> What are the chances of this getting accepted more or less as is?

proper review and discussion of potential follow-ups (no matter who ends
up doing them) would require submitting a properly signed-off patch and a
CLA - see
https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

Fabian

> On Tue, 2025-06-24 at 09:28 +0200, Fabian Grünbichler wrote:
> > > Adam Kalisz via pve-devel <pve-devel@lists.proxmox.com> wrote on
> > > 23.06.2025 18:10 CEST:
> > > Hi list,
> >
> > Hi!
> >
> > > before I go through all the hoops to submit a patch, I wanted to
> > > discuss the current form of the patch, which can be found here:
> > >
> > > https://github.com/NOT-NULL-Makers/proxmox-backup-qemu/commit/e91f09cfd1654010d6205d8330d9cca71358e030
> > >
> > > The speedup process was discussed here:
> > >
> > > https://forum.proxmox.com/threads/abysmally-slow-restore-from-backup.133602/
> > >
> > > The current numbers, with the most recent snapshot of a VM with a
> > > 10 GiB system disk and 2x 100 GiB disks holding random data:
> > >
> > > Original as of 1.5.1:
> > > 10 GiB system: duration=11.78s, speed=869.34MB/s
> > > 100 GiB random 1: duration=412.85s, speed=248.03MB/s
> > > 100 GiB random 2: duration=422.42s, speed=242.41MB/s
> > >
> > > With the 12-way concurrent fetching:
> > > 10 GiB system: duration=2.05s, speed=4991.99MB/s
> > > 100 GiB random 1: duration=100.54s, speed=1018.48MB/s
> > > 100 GiB random 2: duration=100.10s, speed=1022.97MB/s
> >
> > Those numbers do look good - do you also have CPU usage stats before
> > and after?
> >
> > > The hardware on the PVE side:
> > > 2x Intel Xeon Gold 6244, 1 TB RAM, 2x 100 Gbps Mellanox, 14x Samsung
> > > 3.8 TB NVMe drives in RAID10 using mdadm/LVM-thin.
> > >
> > > On the PBS side:
> > > 2x Intel Xeon Gold 6334, 1 TB RAM, 2x 100 Gbps Mellanox, 8x Samsung
> > > NVMe in a RAID of 4 ZFS mirrors with recordsize 1M and lz4
> > > compression.
> > >
> > > Similar or slightly better speeds were achieved on a Hetzner AX52
> > > (AMD Ryzen 7 7700, 64 GB RAM, 2x 1 TB NVMe in a stripe with
> > > recordsize 16k) as PVE, connected to another Hetzner AX52 over a
> > > 10 Gbps link. That PBS again has a plain NVMe ZFS mirror with
> > > recordsize 1M.
> > >
> > > On bigger servers a 16-way concurrency was even better; on smaller
> > > servers with high-frequency CPUs an 8-way concurrency performed
> > > better. The 12-way concurrency is a compromise. We seem to hit a
> > > bottleneck somewhere in the realm of the TLS connection and shallow
> > > buffers. The network on the 100 Gbps servers can support up to
> > > about 3 GB/s (almost 20 Gbps) of traffic in a single TCP connection
> > > using mbuffer. The storage can keep up with such a speed.
> >
> > This sounds like it might make sense to make the number of threads
> > configurable (the second, lower count can probably be derived from
> > it?) to allow high-end systems to make the most of it, without
> > overloading smaller setups. Or maybe deriving it from the host CPU
> > count would also work?
> >
> > > Before I submit the patch, I would also like to do the most
> > > up-to-date build, but I have trouble updating my build environment
> > > to reflect the latest commits. What do I have to put in my
> > > /etc/apt/sources.list to be able to install e.g.
> > > librust-cbindgen-0.27+default-dev,
> > > librust-http-body-util-0.1+default-dev, librust-hyper-1+default-dev
> > > and all the rest?
> >
> > We are currently in the process of rebasing all our repositories on
> > top of the upcoming Debian Trixie release. The built packages are not
> > yet available for public testing, so you'd either need to wait a bit
> > (on the order of a few weeks at most), or submit the patches for the
> > current stable Bookworm-based version and let us forward-port them
> > (see the example sources.list entry at the end of this mail).
> >
> > > This work was sponsored by ČMIS s.r.o. and consulted with the
> > > General Manager Václav Svátek (ČMIS), Daniel Škarda (NOT NULL
> > > Makers s.r.o.) and Linux team leader Roman Müller (ČMIS).
> >
> > Nice! Looking forward to the "official" patch submission!
> >
> > Fabian
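as for the sources.list question quoted above: the librust-* build
dependencies usually come from our devel repository, so for the current
Bookworm-based packages an entry along these lines should work - treat it
as a best guess, and the suite will of course change once the Trixie
rebase lands:

    deb http://download.proxmox.com/debian/devel/ bookworm main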