On 6/22/21 10:06 AM, Philippe Mathieu-Daudé wrote:
> On 6/22/21 9:29 AM, Philippe Mathieu-Daudé wrote:
>> On 6/21/21 5:36 PM, Fam Zheng wrote:
>>>> On 21 Jun 2021, at 16:13, Philippe Mathieu-Daudé <phi...@redhat.com> wrote:
>>>> On 6/21/21 3:18 PM, Fam Zheng wrote:
>>>>>> On 21 Jun 2021, at 10:32, Philippe Mathieu-Daudé <phi...@redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>> When the NVMe block driver was introduced (see commit bdd6a90a9e5,
>>>>>> January 2018), Linux VFIO_IOMMU_MAP_DMA ioctl was only returning
>>>>>> -ENOMEM in case of error. The driver was correctly handling the
>>>>>> error path to recycle its volatile IOVA mappings.
>>>>>>
>>>>>> To fix CVE-2019-3882, Linux commit 492855939bdb ("vfio/type1: Limit
>>>>>> DMA mappings per container", April 2019) added the -ENOSPC error to
>>>>>> signal the user exhausted the DMA mappings available for a container.
>>>>>>
>>>>>> The block driver started to mis-behave:
>>>>>>
>>>>>> qemu-system-x86_64: VFIO_MAP_DMA failed: No space left on device
>>>>>> (qemu)
>>>>>> (qemu) info status
>>>>>> VM status: paused (io-error)
>>>>>> (qemu) c
>>>>>> VFIO_MAP_DMA failed: No space left on device
>>>>>> qemu-system-x86_64: block/block-backend.c:1968: blk_get_aio_context:
>>>>>> Assertion `ctx == blk->ctx' failed.
>>>>>
>>>>> Hi Phil,
>>>>>
>>>>>
>>>>> The diff looks good to me, but I’m not sure what exactly caused the
>>>>> assertion failure. There is `if (r) { goto fail; }` that handles -ENOSPC
>>>>> before, so it should be treated as a general case. What am I missing?
>>>>
>>>> Good catch, ENOSPC ends setting BLOCK_DEVICE_IO_STATUS_NOSPACE
>>>> -> BLOCK_ERROR_ACTION_STOP, so the VM is paused with DMA mapping
>>>> exhausted. I don't understand the full "VM resume" path, but this
>>>> is not what we want (IO_NOSPACE is to warn the operator to add
>>>> more storage and resume, which is pointless in our case, resuming
>>>> won't help until we flush the mappings).
>>>>
>>>> IIUC what we want is return ENOMEM to set BLOCK_DEVICE_IO_STATUS_FAILED.
>>>
>>> I agree with that. It just makes me feel there’s another bug in the
>>> resuming code path. Can you get a backtrace?
>>
>> It seems the resuming code path bug has been fixed elsewhere:
>>
>> (qemu) info status
>> info status
>> VM status: paused (io-error)
>> (qemu) c
>> c
>> 2021-06-22T07:27:00.745466Z qemu-system-x86_64: VFIO_MAP_DMA failed: No
>> space left on device
>> (qemu) info status
>> info status
>> VM status: paused (io-error)
>> (qemu) c
>> c
>> 2021-06-22T07:27:12.458137Z qemu-system-x86_64: VFIO_MAP_DMA failed: No
>> space left on device
>> (qemu) c
>> c
>> 2021-06-22T07:27:13.439167Z qemu-system-x86_64: VFIO_MAP_DMA failed: No
>> space left on device
>> (qemu) c
>> c
>> 2021-06-22T07:27:14.272071Z qemu-system-x86_64: VFIO_MAP_DMA failed: No
>> space left on device
>> (qemu)
>>
>
> I tested all releases up to v4.1.0 and could not trigger the
> blk_get_aio_context() assertion. Building using --enable-debug.
> IIRC Gentoo is more aggressive, so I'll restart using -O2.
Took 4h30 to test all releases with -O3, couldn't reproduce :( I wish I hadn't postponed writing an Ansible test script... On v1 Michal said he doesn't have access to the machine anymore, so I'll assume the other issue got fixed elsewhere.
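
For reference, the errno flow discussed in the quoted thread can be sketched as a small standalone C model. This is not QEMU code: the enum and function names are hypothetical stand-ins for BLOCK_DEVICE_IO_STATUS_* and BLOCK_ERROR_ACTION_*, and the "stop on NOSPACE, report otherwise" policy is a simplified assumption based on the description above. It only shows why turning -ENOSPC into -ENOMEM at the DMA-map call site would end in a FAILED status instead of a paused VM.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the I/O status and error action values
 * named in the thread; these are not the real QEMU definitions. */
enum io_status { IO_STATUS_OK, IO_STATUS_FAILED, IO_STATUS_NOSPACE };
enum error_action { ACTION_REPORT, ACTION_STOP };

/* Classify a positive errno value: ENOSPC becomes NOSPACE, any other
 * non-zero error becomes FAILED. */
enum io_status classify_io_error(int err)
{
    if (err == 0) {
        return IO_STATUS_OK;
    }
    return err == ENOSPC ? IO_STATUS_NOSPACE : IO_STATUS_FAILED;
}

/* Assumed policy: NOSPACE stops the VM so the operator can add storage
 * and resume; every other status is simply reported. */
enum error_action action_for_status(enum io_status st)
{
    return st == IO_STATUS_NOSPACE ? ACTION_STOP : ACTION_REPORT;
}

/* Remap the negative errno returned by a DMA-map call (hypothetical
 * helper): -ENOSPC is turned into -ENOMEM, everything else is passed
 * through unchanged. */
int remap_dma_map_error(int ret)
{
    assert(ret <= 0); /* expects 0 or a negative errno */
    return ret == -ENOSPC ? -ENOMEM : ret;
}
```

In this model, classify_io_error(ENOSPC) leads to ACTION_STOP, while classify_io_error(-remap_dma_map_error(-ENOSPC)) yields IO_STATUS_FAILED and therefore ACTION_REPORT.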