TL;DR: This fixes virtio in a way that is transparent to the guest. We should now be able to revert commits aa8580cd and df0acded19ec, which worked around the bug in a way that is not transparent.
-----

commit aa8580cd "pc: memhp: force gaps between DIMM's GPA" introduced gaps
in GPA space to work around a bug in virtio, originally reported by Igor,
and I quote:

------
QEMU aborts during guest reboot with following backtrace:

  Breakpoint 1, virtqueue_map_sg (sg=0x555557cbc6c0, addr=0x555557cb86c0, num_sg=0x12, is_write=0x1) at hw/virtio/virtio.c:453
  453         error_report("virtio: error trying to map MMIO memory");
  (gdb) bt
  #0  virtqueue_map_sg (sg=0x555557cbc6c0, addr=0x555557cb86c0, num_sg=0x12, is_write=0x1) at hw/virtio/virtio.c:453
  #1  0x000055555569b3ef in virtqueue_pop (vq=0x555558a3fab0, elem=0x555557cb86b0) at hw/virtio/virtio.c:520
  #2  0x0000555555666611 in virtio_blk_get_request (s=0x5555588c7a00) at hw/block/virtio-blk.c:194
  #3  0x00005555556676ec in virtio_blk_handle_output (vdev=0x5555588c7a00, vq=0x555558a3fab0) at hw/block/virtio-blk.c:603
  #4  0x000055555569c5c8 in virtio_queue_notify_vq (vq=0x555558a3fab0) at hw/virtio/virtio.c:921
  #5  0x000055555569e009 in virtio_queue_host_notifier_read (n=0x555558a3faf8) at hw/virtio/virtio.c:1480
  #6  0x000055555591b062 in qemu_iohandler_poll (pollfds=0x555556363800, ret=0x1) at iohandler.c:126
  #7  0x000055555591ad33 in main_loop_wait (nonblocking=0x0) at main-loop.c:503
  #8  0x00005555557466b5 in main_loop () at vl.c:1902
  #9  0x000055555574e69b in main (argc=0x4d, argv=0x7fffffffda18, envp=0x7fffffffdc88) at vl.c:4653

could be reproduced with following options:

  -enable-kvm -m 1G,slots=250,maxmem=32G -drive if=virtio,file=rhel72
  -netdev tap,id=foo,ifname=tap0,script=./qemu-ifup
  -device virtio-net-pci,id=n1,netdev=foo
  `for i in $(seq 0 15); do echo -n "-object memory-backend-ram,id=m$i,size=10M -device pc-dimm,id=dimm$i,memdev=m$i "; done`
  -snapshot -monitor unix:/tmp/m,server,nowait

boot and login to guest shell and execute 'reboot' command.
On the reboot, when guest kernel boots, QEMU will abort in virtio.
Reproducible in about 80% cases. If QEMU doesn't crash on reboot
then try again with freshly started QEMU.

Reason for crashing is that guest allocates buffer that crosses boundary
between 2 different memory regions and as result cpu_physical_memory_map()
maps GPA to HVA for only head of buffer that belongs to the first region,
which makes condition len != sg[i].iov_len true, since declared buffer
size (sg[i].iov_len) isn't what cpu_physical_memory_map() has been able
to map (len), which leads to abort:

  virtqueue_map_sg() {
  ...
      sg[i].iov_base = cpu_physical_memory_map(addr[i], &len, is_write);
      if (sg[i].iov_base == NULL || len != sg[i].iov_len) {
          abort()
------

However, the work-around regressed memory hot-unplug for linux guests,
triggering the following within the guest:

------
=====
kernel BUG at mm/memory_hotplug.c:703!
...
[<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
[<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
[<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
===
    BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
===

------

The reason for the crash is that x86-64 linux guest supports memory
hotplug in chunks of 128Mb and assumes that memory sections are also
128Mb aligned. However gaps forced between 128Mb DIMMs with backend's
natural alignment of 2Mb make the 2nd and following DIMMs not being
aligned on 128Mb boundary.

------

Michael S. Tsirkin (6):
  virtio: introduce virtio_map
  virtio: switch to virtio_map
  virtio-blk: convert to virtqueue_map
  virtio-serial: convert to virtio_map
  virtio-scsi: convert to virtqueue_map
  virtio: drop virtqueue_map_sg

 include/hw/virtio/virtio.h  |  3 +--
 hw/block/virtio-blk.c       |  5 +----
 hw/char/virtio-serial-bus.c |  5 +----
 hw/scsi/virtio-scsi.c       | 16 ++------------
 hw/virtio/virtio.c          | 52 +++++++++++++++++++++++++++++++++++----------
 5 files changed, 46 insertions(+), 35 deletions(-)

--
MST
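
For illustration only, the two failure modes quoted above can be modeled
with a little arithmetic. This is a rough sketch and not code from QEMU or
from this series: align_up(), dimm_start() and mappable_len() are
hypothetical helpers, and the one-byte gap added before re-aligning is an
assumption about how the work-around places DIMMs. The first part shows
why a gap followed by 2 MiB alignment pushes the 2nd 128 MiB DIMM off a
128 MiB boundary; the second shows how a mapping that stops at a region
boundary returns a shorter length than the buffer declared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/* Hypothetical helper: round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Start address of the n-th DIMM (0-based) when all DIMMs have the same
 * size and are placed back to back from 'base'.
 * With gap == true, one extra byte is skipped after each DIMM before
 * aligning to the backend alignment (assumed model of the work-around).
 */
static uint64_t dimm_start(uint64_t base, uint64_t size, uint64_t align,
                           unsigned n, bool gap)
{
    uint64_t addr = align_up(base, align);

    for (unsigned i = 0; i < n; i++) {
        addr = align_up(addr + size + (gap ? 1 : 0), align);
    }
    return addr;
}

/*
 * Toy model of a mapping call that cannot cross a region boundary:
 * returns how many of 'len' bytes starting at 'addr' fit before
 * 'region_end', the end of the region that contains 'addr'.
 */
static uint64_t mappable_len(uint64_t addr, uint64_t len, uint64_t region_end)
{
    uint64_t room = region_end - addr;

    return len < room ? len : room;
}
```

With base = 4096 MiB, size = 128 MiB and align = 2 MiB, dimm_start() for
n = 1 is 128 MiB aligned without the gap and 2 MiB past a 128 MiB
boundary with it. Likewise, a 4096-byte buffer starting 100 bytes before
the end of a region gets only 100 bytes from mappable_len(), which is the
len != sg[i].iov_len situation described in the report.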