On 4/7/22 3:53 PM, Dr. David Alan Gilbert wrote:
> * Claudio Fontana (cfont...@suse.de) wrote:
>> On 4/5/22 10:35 AM, Dr. David Alan Gilbert wrote:
>>> * Claudio Fontana (cfont...@suse.de) wrote:
>>>> On 3/28/22 10:31 AM, Daniel P. Berrangé wrote:
>>>>> On Sat, Mar 26, 2022 at 04:49:46PM +0100, Claudio Fontana wrote:
>>>>>> On 3/25/22 12:29 PM, Daniel P. Berrangé wrote:
>>>>>>> On Fri, Mar 18, 2022 at 02:34:29PM +0100, Claudio Fontana wrote:
>>>>>>>> On 3/17/22 4:03 PM, Dr. David Alan Gilbert wrote:
>>>>>>>>> * Claudio Fontana (cfont...@suse.de) wrote:
>>>>>>>>>> On 3/17/22 2:41 PM, Claudio Fontana wrote:
>>>>>>>>>>> On 3/17/22 11:25 AM, Daniel P. Berrangé wrote:
>>>>>>>>>>>> On Thu, Mar 17, 2022 at 11:12:11AM +0100, Claudio Fontana wrote:
>>>>>>>>>>>>> On 3/16/22 1:17 PM, Claudio Fontana wrote:
>>>>>>>>>>>>>> On 3/14/22 6:48 PM, Daniel P. Berrangé wrote:
>>>>>>>>>>>>>>> On Mon, Mar 14, 2022 at 06:38:31PM +0100, Claudio Fontana wrote:
>>>>>>>>>>>>>>>> On 3/14/22 6:17 PM, Daniel P. Berrangé wrote:
>>>>>>>>>>>>>>>>> On Sat, Mar 12, 2022 at 05:30:01PM +0100, Claudio Fontana wrote:
>>>>>>>>>>>>>>>>>> the first user is the qemu driver,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> virsh save/resume would slow to a crawl with a default pipe
>>>>>>>>>>>>>>>>>> size (64k).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This improves the situation by 400%.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Going through io_helper still seems to incur some penalty
>>>>>>>>>>>>>>>>>> (~15%-ish) compared with direct qemu migration to a nc
>>>>>>>>>>>>>>>>>> socket to a file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Signed-off-by: Claudio Fontana <cfont...@suse.de>
>>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>>  src/qemu/qemu_driver.c    |  6 +++---
>>>>>>>>>>>>>>>>>>  src/qemu/qemu_saveimage.c | 11 ++++++-----
>>>>>>>>>>>>>>>>>>  src/util/virfile.c        | 12 ++++++++++++
>>>>>>>>>>>>>>>>>>  src/util/virfile.h        |  1 +
>>>>>>>>>>>>>>>>>>  4 files changed, 22 insertions(+), 8 deletions(-)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello, I initially thought this to be a qemu performance issue,
>>>>>>>>>>>>>>>>>> so you can find the discussion about this in qemu-devel:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> "Re: bad virsh save /dev/null performance (600 MiB/s max)"
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
>>>>>>>>>>>>
>>>>>>>>>>>>> Current results show these experimental average maximum throughputs
>>>>>>>>>>>>> migrating to /dev/null for each FdWrapper pipe size (as per QEMU QMP
>>>>>>>>>>>>> "query-migrate", tests repeated 5 times for each).
>>>>>>>>>>>>> VM size is 60G, with most of the memory effectively touched before
>>>>>>>>>>>>> migration by a user application allocating and filling all memory
>>>>>>>>>>>>> with pseudorandom data.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 64K:  5200 Mbps (current situation)
>>>>>>>>>>>>> 128K: 5800 Mbps
>>>>>>>>>>>>> 256K: 20900 Mbps
>>>>>>>>>>>>> 512K: 21600 Mbps
>>>>>>>>>>>>> 1M:   22800 Mbps
>>>>>>>>>>>>> 2M:   22800 Mbps
>>>>>>>>>>>>> 4M:   22400 Mbps
>>>>>>>>>>>>> 8M:   22500 Mbps
>>>>>>>>>>>>> 16M:  22800 Mbps
>>>>>>>>>>>>> 32M:  22900 Mbps
>>>>>>>>>>>>> 64M:  22900 Mbps
>>>>>>>>>>>>> 128M: 22800 Mbps
>>>>>>>>>>>>>
>>>>>>>>>>>>> The above is the throughput out of patched libvirt with multiple
>>>>>>>>>>>>> pipe sizes for the FDWrapper.
>>>>>>>>>>>>
>>>>>>>>>>>> Ok, it's bouncing around with noise after 1 MB. So I'd suggest that
>>>>>>>>>>>> libvirt attempt to raise the pipe limit to 1 MB by default, but
>>>>>>>>>>>> not try to go higher.
>>>>>>>>>>>>
>>>>>>>>>>>>> As for the theoretical limit for the libvirt architecture,
>>>>>>>>>>>>> I ran a qemu migration directly, issuing the appropriate QMP
>>>>>>>>>>>>> commands, setting the same migration parameters as per libvirt,
>>>>>>>>>>>>> and then migrating to a socket netcatted to /dev/null via
>>>>>>>>>>>>> {"execute": "migrate", "arguments": { "uri":
>>>>>>>>>>>>> "unix:///tmp/netcat.sock" } } :
>>>>>>>>>>>>>
>>>>>>>>>>>>> QMP: 37000 Mbps
>>>>>>>>>>>>
>>>>>>>>>>>>> So although the pipe size improves things (in particular the
>>>>>>>>>>>>> large jump is at the 256K size, although 1M seems a very good
>>>>>>>>>>>>> value), there is still a second bottleneck in there somewhere
>>>>>>>>>>>>> that accounts for a loss of ~14200 Mbps in throughput.
>>>>>>>>>>
>>>>>>>>>> Interesting addition: I tested quickly on a system with faster cpus
>>>>>>>>>> and larger VM sizes, up to 200GB, and the difference in throughput
>>>>>>>>>> libvirt vs qemu is basically the same, ~14500 Mbps:
>>>>>>>>>>
>>>>>>>>>> ~50000 mbps qemu to netcat socket to /dev/null
>>>>>>>>>> ~35500 mbps virsh save to /dev/null
>>>>>>>>>>
>>>>>>>>>> So it does not seem to be proportional to cpu speed (not a totally
>>>>>>>>>> fair comparison because the VM sizes are different).
>>>>>>>>>
>>>>>>>>> It might be closer to RAM or cache bandwidth limited though; for an
>>>>>>>>> extra copy.
>>>>>>>>
>>>>>>>> I was thinking about sendfile(2) in iohelper, but that probably
>>>>>>>> can't work as the input fd is a socket, I am getting EINVAL.
>>>>>>>
>>>>>>> Yep, sendfile() requires the input to be a mmapable FD,
>>>>>>> and the output to be a socket.
>>>>>>>
>>>>>>> Try splice() instead, which merely requires 1 end to be a
>>>>>>> pipe, and the other end can be any FD afaik.
>>>>>>
>>>>>> I did try splice(), but performance is worse by around 500%.
>>>>>
>>>>> Hmm, that's certainly unexpected !
>>>>>
>>>>>> Any ideas welcome,
>>>>>
>>>>> I learnt there is also a newer copy_file_range call, not sure if that's
>>>>> any better.
>>>>>
>>>>> You passed len as 1 MB, I wonder if passing MAXINT is viable ? We just
>>>>> want to copy everything IIRC.
>>>>>
>>>>> With regards,
>>>>> Daniel
>>>>
>>>> Crazy idea: would trying to use the parallel migration concept for
>>>> migrating to/from a file make any sense?
>>>>
>>>> Not sure if the qemu multifd implementation would apply directly; maybe
>>>> it could be given another implementation for "toFile", trying to use
>>>> more than one cpu to do the transfer?
>>>
>>> I can't see a way that would help; well, I could if you could
>>> somehow have multiple io helper threads that dealt with it.
>>
>> The first issue I encounter here for both the "virsh save" and "virsh
>> restore" scenarios is that libvirt uses fd: migration, not unix: migration.
>> QEMU supports multifd for unix:, tcp:, vsock: as far as I can see.
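As an aside, to make the splice() discussion further up a bit more concrete:
a splice()-based copy loop in the iohelper is shaped roughly like the sketch
below. This is only an illustration of the idea, not the code I actually
benchmarked; the fd names are made up and error handling is trimmed. Also,
copy_file_range(), the other call mentioned, is as far as I can tell meant
for regular files on both ends, so it would not directly take the pipe as
its source.

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Illustrative sketch only: drain the migration pipe into the save file.
 * splice() needs at least one side to be a pipe, which is the case here. */
static int
drainPipeToFile(int pipefd, int filefd)
{
    for (;;) {
        /* 1 MB per call is the len Daniel refers to above; his question is
         * whether asking for a much larger chunk behaves any differently. */
        ssize_t n = splice(pipefd, NULL, filefd, NULL,
                           1024 * 1024, SPLICE_F_MOVE | SPLICE_F_MORE);
        if (n == 0)
            return 0;   /* writer closed the pipe, everything copied */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            perror("splice");
            return -1;
        }
    }
}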
>>
>> Current save procedure in QMP in short:
>>
>> {"execute":"migrate-set-capabilities", ...}
>> {"execute":"migrate-set-parameters", ...}
>> {"execute":"getfd","arguments":{"fdname":"migrate"}, ...}  fd=26
>> QEMU_MONITOR_IO_SEND_FD: fd=26
>> {"execute":"migrate","arguments":{"uri":"fd:migrate"}, ...}
>>
>> Current restore procedure in QMP in short:
>>
>> (start QEMU)
>> {"execute":"migrate-incoming","arguments":{"uri":"fd:21"}, ...}
>>
>> Should I investigate changing libvirt to use unix: for save/restore?
>> Or should I look into changing qemu to somehow accept fd: for multifd,
>> meaning I guess providing multiple fd: uris in the migrate command?
>
> So I'm not sure this is the right direction; i.e. if multifd is the
> right answer to your problem.
Of course, just exploring the space.

> However, I think the qemu code probably really really wants to be a
> socket.

Understood, I'll try to bend libvirt to use unix:/// and see how far I get,

Thanks,

Claudio

>
> Dave
>
>>
>> Thank you for your help,
>>
>> Claudio
>>
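P.S.: for anyone skimming the thread, the "pipe size" being tuned here is the
pipe's kernel buffer capacity, which on Linux is raised with
fcntl(F_SETPIPE_SZ). Just to illustrate the idea, a minimal sketch (not the
actual helper added to src/util/virfile.c by this patch; the function name is
made up):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>

/* Illustrative sketch only: ask the kernel to grow a pipe's buffer.
 * The kernel may round the request up; unprivileged callers are capped
 * by /proc/sys/fs/pipe-max-size (1 MiB by default). */
static int
increasePipeSize(int fd, int desired)
{
    int actual = fcntl(fd, F_SETPIPE_SZ, desired);
    if (actual < 0) {
        perror("fcntl(F_SETPIPE_SZ)");
        return -1;
    }
    return actual;   /* capacity actually granted, in bytes */
}

That 1 MiB default cap also fits Daniel's earlier suggestion of raising the
pipe to 1 MB by default and not trying to go higher.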