On Thu, Nov 14, 2019 at 09:58:47AM +1100, Alexey Kardashevskiy wrote:
>
> On 13/11/2019 21:00, Michael S. Tsirkin wrote:
> > On Wed, Nov 13, 2019 at 03:44:28PM +1100, Alexey Kardashevskiy wrote:
> >>
> >> On 12/11/2019 18:08, Michael S. Tsirkin wrote:
> >>> On Tue, Nov 12, 2019 at 02:53:49PM +1100, Alexey Kardashevskiy wrote:
> >>>> Hi!
> >>>>
> >>>> I am enabling IOMMU for virtio in the pseries firmware (SLOF) and seeing
> >>>> problems, one of them is SLOF does SCSI bus scan, then it stops the
> >>>> virtio-scsi by clearing MMIO|IO|BUSMASTER from PCI_COMMAND (as SLOF
> >>>> stopped using the devices) and when this happens, I see unassigned
> >>>> memory access (see below) which happens because disabling busmaster
> >>>> disables IOMMU and QEMU cannot access the rings to do some shutdown. And
> >>>> when this happens, the device does not come back even if SLOF re-enables
> >>>> it.
> >>>
> >>> In fact clearing bus master should disable ring access even
> >>> without the IOMMU.
> >>> Once you do this you should not wait for rings to be processed,
> >>> it is safe to assume they won't be touched again and just
> >>> free up any buffers that have not been used.
> >>>
> >>> Why don't you see this without IOMMU?
> >>
> >> Because without IOMMU, virtio can always access rings, it does not need
> >> bus master address space for that.
> >
> > Right and that's a bug in virtio scsi. E.g. virtio net checks
> > bus mastering before each access.
>
> You have to be specific - virtio scsi in the guest or in QEMU?

If a device accesses memory with bus master off, it's buggy.

>
> > Which is all well and good, but we can't just break the world
> > so I guess we first need to fix SLOF, and then add
> > a compat property. And maybe keep it broken for
> > legacy ...
> >
> >>
> >>> It's a bug I think, probably there to work around buggy guests.
> >>>
> >>> So pls fix this in SLOF and then hopefully we can drop the
> >>> work arounds and have clearing bus master actually block DMA.
> >>
> >> Laszlo suggested writing 0 to the status but this does not seem helping,
> >> with both ioeventfd=true/false. It looks like setting/clearing busmaster
> >> bit confused memory region caches in QEMU's virtio. I am confused which
> >> direction to keep digging to, any suggestions? Thanks,
> >>
> >
> > to clarify you reset after setting bus master? right?
>
> I was talking about clearing the bus master, and where I call that
> virtio reset does not matter. Thanks,
>

so bus master =0
reset
bus master =1

device does not recover?

> >
> >>
> >>>
> >>>> Hacking SLOF to not clear BUSMASTER makes virtio-scsi work but it is
> >>>> hardly a right fix.
> >>>>
> >>>> Is this something expected?
> >>>> Thanks,
> >>>>
> >>>> Here is the exact command line:
> >>>>
> >>>> /home/aik/pbuild/qemu-garrison2-ppc64/ppc64-softmmu/qemu-system-ppc64 \
> >>>> -nodefaults \
> >>>> -chardev stdio,id=STDIO0,signal=off,mux=on \
> >>>> -device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
> >>>> -mon id=MON0,chardev=STDIO0,mode=readline \
> >>>> -nographic \
> >>>> -vga none \
> >>>> -enable-kvm \
> >>>> -m 2G \
> >>>> -device virtio-scsi-pci,id=vscsi0,iommu_platform=on,disable-modern=off,disable-legacy=on \
> >>>> -drive id=DRIVE0,if=none,file=img/u1804-64le.qcow2,format=qcow2 \
> >>>> -device scsi-disk,id=scsi-disk0,drive=DRIVE0 \
> >>>> -snapshot \
> >>>> -smp 1 \
> >>>> -machine pseries \
> >>>> -L /home/aik/t/qemu-ppc64-bios/ \
> >>>> -trace events=qemu_trace_events \
> >>>> -d guest_errors \
> >>>> -chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.ssh59518 \
> >>>> -mon chardev=SOCKET0,mode=control
> >>>>
> >>>> Here is the backtrace:
> >>>>
> >>>> Thread 5 "qemu-system-ppc" hit Breakpoint 8, unassigned_mem_accepts
> >>>> (opaque=0x0, addr=0x5802, size=0x2, is_write=0x0, attrs=...) at
> >>>> /home/aik/p/qemu/memory.c:1275
> >>>> 1275 return false;
> >>>> #0 unassigned_mem_accepts (opaque=0x0, addr=0x5802, size=0x2,
> >>>> is_write=0x0, attrs=...) at /home/aik/p/qemu/memory.c:1275
> >>>> #1 0x00000000100a8ac8 in memory_region_access_valid (mr=0x1105c230
> >>>> <io_mem_unassigned>, addr=0x5802, size=0x2, is_write=0x0, attrs=...) at
> >>>> /home/aik/p/qemu/memory.c:1377
> >>>> #2 0x00000000100a8c88 in memory_region_dispatch_read (mr=0x1105c230
> >>>> <io_mem_unassigned>, addr=0x5802, pval=0x7ffff550d410, op=MO_16,
> >>>> attrs=...)
> >>>> at /home/aik/p/qemu/memory.c:1418
> >>>> #3 0x000000001001cad4 in address_space_lduw_internal_cached_slow
> >>>> (cache=0x7fff68036fa0, addr=0x2, attrs=..., result=0x0,
> >>>> endian=DEVICE_LITTLE_ENDIAN) at /home/aik/p/qemu/memory_ldst.inc.c:211
> >>>> #4 0x000000001001cc84 in address_space_lduw_le_cached_slow
> >>>> (cache=0x7fff68036fa0, addr=0x2, attrs=..., result=0x0) at
> >>>> /home/aik/p/qemu/memory_ldst.inc.c:249
> >>>> #5 0x000000001019bd80 in address_space_lduw_le_cached
> >>>> (cache=0x7fff68036fa0, addr=0x2, attrs=..., result=0x0) at
> >>>> /home/aik/p/qemu/include/exec/memory_ldst_cached.inc.h:56
> >>>> #6 0x000000001019c10c in lduw_le_phys_cached (cache=0x7fff68036fa0,
> >>>> addr=0x2) at /home/aik/p/qemu/include/exec/memory_ldst_phys.inc.h:91
> >>>> #7 0x000000001019d86c in virtio_lduw_phys_cached (vdev=0x118b9110,
> >>>> cache=0x7fff68036fa0, pa=0x2) at
> >>>> /home/aik/p/qemu/include/hw/virtio/virtio-access.h:166
> >>>> #8 0x000000001019e648 in vring_avail_idx (vq=0x118c2720) at
> >>>> /home/aik/p/qemu/hw/virtio/virtio.c:302
> >>>> #9 0x000000001019f5bc in virtio_queue_split_empty (vq=0x118c2720) at
> >>>> /home/aik/p/qemu/hw/virtio/virtio.c:581
> >>>> #10 0x000000001019f838 in virtio_queue_empty (vq=0x118c2720) at
> >>>> /home/aik/p/qemu/hw/virtio/virtio.c:612
> >>>> #11 0x00000000101a8fa8 in virtio_queue_host_notifier_aio_poll
> >>>> (opaque=0x118c2798) at /home/aik/p/qemu/hw/virtio/virtio.c:3389
> >>>> #12 0x000000001092679c in run_poll_handlers_once (ctx=0x11212e40,
> >>>> timeout=0x7ffff550d7d8) at /home/aik/p/qemu/util/aio-posix.c:520
> >>>> #13 0x0000000010926aec in try_poll_mode (ctx=0x11212e40,
> >>>> timeout=0x7ffff550d7d8) at /home/aik/p/qemu/util/aio-posix.c:607
> >>>> #14 0x0000000010926c2c in aio_poll (ctx=0x11212e40, blocking=0x1) at
> >>>> /home/aik/p/qemu/util/aio-posix.c:639
> >>>> #15 0x000000001091fe0c in aio_wait_bh_oneshot (ctx=0x11212e40,
> >>>> cb=0x1016f35c <virtio_scsi_dataplane_stop_bh>, opaque=0x118b9110) at
> >>>> /home/aik/p/qemu/util/aio-wait.c:71
> >>>> #16 0x000000001016fa60 in virtio_scsi_dataplane_stop (vdev=0x118b9110)
> >>>> at /home/aik/p/qemu/hw/scsi/virtio-scsi-dataplane.c:211
> >>>> #17 0x0000000010684740 in virtio_bus_stop_ioeventfd (bus=0x118b9098) at
> >>>> /home/aik/p/qemu/hw/virtio/virtio-bus.c:245
> >>>> #18 0x0000000010688290 in virtio_pci_stop_ioeventfd (proxy=0x118b0fa0)
> >>>> at /home/aik/p/qemu/hw/virtio/virtio-pci.c:292
> >>>> #19 0x00000000106891e8 in virtio_write_config (pci_dev=0x118b0fa0,
> >>>> address=0x4, val=0x100100, len=0x4) at
> >>>> /home/aik/p/qemu/hw/virtio/virtio-pci.c:613
> >>>> #20 0x00000000105b7228 in pci_host_config_write_common
> >>>> (pci_dev=0x118b0fa0, addr=0x4, limit=0x100, val=0x100100, len=0x4) at
> >>>> /home/aik/p/qemu/hw/pci/pci_host.c:81
> >>>> #21 0x00000000101f7bdc in finish_write_pci_config (spapr=0x11217200,
> >>>> buid=0x800000020000000, addr=0x4, size=0x4, val=0x100100,
> >>>> rets=0x7e7533e0) at /home/aik/p/qemu/hw/ppc/spapr_pci.c:192
> >>>> #22 0x00000000101f7cec in rtas_ibm_write_pci_config (cpu=0x11540df0,
> >>>> spapr=0x11217200, token=0x2017, nargs=0x5, args=0x7e7533cc, nret=0x1,
> >>>> rets=0x7e7533e0) at /home/aik/p/qemu/hw/ppc/spapr_pci.c:216
> >>>> #23 0x00000000101f5860 in spapr_rtas_call (cpu=0x11540df0,
> >>>> spapr=0x11217200, token=0x2017, nargs=0x5, args=0x7e7533cc, nret=0x1,
> >>>> rets=0x7e7533e0) at /home/aik/p/qemu/hw/ppc/spapr_rtas.c:416
> >>>> #24 0x00000000101ee214 in h_rtas (cpu=0x11540df0, spapr=0x11217200,
> >>>> opcode=0xf000, args=0x7ffff4cf0030) at
> >>>> /home/aik/p/qemu/hw/ppc/spapr_hcall.c:1214
> >>>> #25 0x00000000101f0524 in spapr_hypercall (cpu=0x11540df0,
> >>>> opcode=0xf000, args=0x7ffff4cf0030) at
> >>>> /home/aik/p/qemu/hw/ppc/spapr_hcall.c:2014
> >>>> #26 0x000000001033bff0 in kvm_arch_handle_exit (cs=0x11540df0,
> >>>> run=0x7ffff4cf0000) at /home/aik/p/qemu/target/ppc/kvm.c:1684
> >>>> #27 0x00000000100cc7c8 in kvm_cpu_exec (cpu=0x11540df0) at
> >>>> /home/aik/p/qemu/accel/kvm/kvm-all.c:2391
> >>>> #28 0x000000001008edf8 in qemu_kvm_cpu_thread_fn (arg=0x11540df0) at
> >>>> /home/aik/p/qemu/cpus.c:1318
> >>>> #29 0x000000001092c704 in qemu_thread_start (args=0x11588d90) at
> >>>> /home/aik/p/qemu/util/qemu-thread-posix.c:519
> >>>> #30 0x00007ffff7b58070 in start_thread (arg=0x7ffff550ebf0) at
> >>>> pthread_create.c:335
> >>>> #31 0x00007ffff7aa3a70 in clone () at
> >>>> ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:96
> >>>> (gdb)
> >>>>
> >>>> --
> >>>> Alexey
> >>>
> >>
> >> --
> >> Alexey
> >
>
> --
> Alexey