Hi Michael, Hongyu,

On 2/14/25 1:16 PM, Michael S. Tsirkin wrote:
> On Fri, Feb 14, 2025 at 08:56:56AM +0100, Eric Auger wrote:
>> Hi,
>>
>> On 2/14/25 8:21 AM, Ning, Hongyu wrote:
>>>
>>> On 2025/2/6 16:59, Eric Auger wrote:
>>>> Hi,
>>>>
>>>> On 2/4/25 12:46 PM, Eric Auger wrote:
>>>>> Hi,
>>>>>
>>>>> On 2/3/25 3:48 PM, Michael S. Tsirkin wrote:
>>>>>> On Fri, Jan 31, 2025 at 10:53:15AM +0100, Eric Auger wrote:
>>>>>>> Hi Kirill, Michael,
>>>>>>>
>>>>>>> On 8/8/24 9:51 AM, Kirill A. Shutemov wrote:
>>>>>>>> Hongyu reported a hang on kexec in a VM. QEMU reported invalid memory
>>>>>>>> accesses during the hang:
>>>>>>>>
>>>>>>>>   Invalid read at addr 0x102877002, size 2, region '(null)',
>>>>>>>>   reason: rejected
>>>>>>>>   Invalid write at addr 0x102877A44, size 2, region '(null)',
>>>>>>>>   reason: rejected
>>>>>>>>   ...
>>>>>>>>
>>>>>>>> It was traced down to virtio-console. Kexec works fine if
>>>>>>>> virtio-console is not in use.
>>>>>>>>
>>>>>>>> It looks like virtio-console continues to write to the MMIO even after
>>>>>>>> the underlying virtio-pci device is removed.
>>>>>>>>
>>>>>>>> The problem can be mitigated by removing all virtio devices on virtio
>>>>>>>> bus shutdown.
>>>>>>>>
>>>>>>>> Signed-off-by: Kirill A. Shutemov <kirill.shute...@linux.intel.com>
>>>>>>>> Reported-by: Hongyu Ning <hongyu.n...@linux.intel.com>
>>>>>>>
>>>>>>> Gentle ping on this patch, which seems to have fallen through the cracks.
>>>>>>>
>>>>>>> I think this fix is really needed. I have another test case with a
>>>>>>> rebooting guest exposed with virtio-net (backed by vhost-net) and a
>>>>>>> viommu. Since there is currently no shutdown for virtio-net, on
>>>>>>> reboot the IOMMU is disabled through native_machine_shutdown()/
>>>>>>> x86_platform.iommu_shutdown() while the virtio-net device is still alive.
>>>>>>>
>>>>>>> Normally device_shutdown() should call the virtio-net shutdown before
>>>>>>> the IOMMU tear down and we wouldn't see any spurious transactions after
>>>>>>> the IOMMU shutdown.
>>>>>>>
>>>>>>> With this fix, the above test case passes and I no longer see spurious
>>>>>>> vhost IOTLB miss requests.
>>>>>>>
>>>>>>> For more details, see the qemu thread ([PATCH] hw/virtio/vhost: Disable
>>>>>>> IOTLB callbacks when IOMMU gets disabled,
>>>>>>> https://lore.kernel.org/all/20250120173339.865681-1-eric.au...@redhat.com/)
>>>>>>>
>>>>>>> Reviewed-by: Eric Auger <eric.au...@redhat.com>
>>>>>>> Tested-by: Eric Auger <eric.au...@redhat.com>
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Eric
>>>>>>>
>>>>>>>> ---
>>>>>>>>  drivers/virtio/virtio.c | 10 ++++++++++
>>>>>>>>  1 file changed, 10 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>>>>>>>> index a9b93e99c23a..6c2f908eb22c 100644
>>>>>>>> --- a/drivers/virtio/virtio.c
>>>>>>>> +++ b/drivers/virtio/virtio.c
>>>>>>>> @@ -356,6 +356,15 @@ static void virtio_dev_remove(struct device *_d)
>>>>>>>>  	of_node_put(dev->dev.of_node);
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +static void virtio_dev_shutdown(struct device *_d)
>>>>>>>> +{
>>>>>>>> +	struct virtio_device *dev = dev_to_virtio(_d);
>>>>>>>> +	struct virtio_driver *drv = drv_to_virtio(dev->dev.driver);
>>>>>>>> +
>>>>>>>> +	if (drv && drv->remove)
>>>>>>>> +		drv->remove(dev);
>>>>>>
>>>>>> I am concerned that a full remove is a heavyweight operation.
>>>>>> I do not want to slow down reboots even more.
>>>>>> How about just doing a reset, instead?
>>>>>
>>>>> I tested with
>>>>>
>>>>> static void virtio_dev_shutdown(struct device *_d)
>>>>> {
>>>>> 	struct virtio_device *dev = dev_to_virtio(_d);
>>>>>
>>>>> 	virtio_reset_device(dev);
>>>>> }
>>>>>
>>>>> and it fixes my issue.
>>>>>
>>>>> Kirill, would that fix your issue too?
>>>
>>> Hi,
>>>
>>> sorry for my late response, I synced with Kirill offline and did a retest.
>>>
>>> The issue is still reproduced on my side: kexec gets stuck when
>>> "console=hvc0" is appended to the kernel cmdline, even with this patch applied.
>>
>> Thanks for testing!
>>
>> Michael, it looks like the initial patch from Kirill is the one that
>> fixes both issues. The virtio_reset_device() approach does not work for the
>> initial bug report, while it works for me. Other ideas?
>>
>> Thanks
>>
>> Eric
>
> Ah, wait a second.
>
> 	Looks like virtio-console continues to write to the MMIO even after
> 	the underlying virtio-pci device is removed.
>
> Hmm. I am not sure why that is a problem, but I assume some hypervisors just
> hang the system if you try to kick them after reset.
> Unfortunate that the spec did not disallow it.
>
> If we want to prevent that, we want to do something like this:
>
> 	/*
> 	 * Some devices get wedged if you kick them after they are
> 	 * reset. Mark all vqs as broken to make sure we don't.
> 	 */
> 	virtio_break_device(dev);
> 	/*
> 	 * The virtio_synchronize_cbs() below guarantees that any
> 	 * interrupt for this device arriving after it has completed
> 	 * sees vq->broken as true.
> 	 */
> 	virtio_synchronize_cbs(dev);
> 	dev->config->reset(dev);
>
> I assume this still works for you, yes?

Would that still be done in virtio_dev_shutdown()?
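Just to make sure I read the proposal correctly, here is a rough, untested
sketch of how that sequence could sit in the bus ->shutdown callback
(reusing the virtio_dev_shutdown() name from Kirill's patch; illustration
only, not something I have tested):

static void virtio_dev_shutdown(struct device *_d)
{
	struct virtio_device *dev = dev_to_virtio(_d);

	/*
	 * Some devices get wedged if you kick them after they are
	 * reset. Mark all vqs as broken to make sure we don't.
	 */
	virtio_break_device(dev);
	/*
	 * Make sure any callback racing with the shutdown sees
	 * vq->broken as true before the device is reset.
	 */
	virtio_synchronize_cbs(dev);
	dev->config->reset(dev);
}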
Is that what you tested, Hongyu?

Eric

>
>>>
>>> my kernel code base is 6.14.0-rc2.
>>>
>>> let me know if any more experiments are needed.
>>>
>>> ---
>>>  drivers/virtio/virtio.c | 8 ++++++++
>>>  1 file changed, 8 insertions(+)
>>>
>>> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
>>> index ba37665188b5..f9f885d04763 100644
>>> --- a/drivers/virtio/virtio.c
>>> +++ b/drivers/virtio/virtio.c
>>> @@ -395,6 +395,13 @@ static const struct cpumask *virtio_irq_get_affinity(struct device *_d,
>>>  	return dev->config->get_vq_affinity(dev, irq_vec);
>>>  }
>>>
>>> +static void virtio_dev_shutdown(struct device *_d)
>>> +{
>>> +	struct virtio_device *dev = dev_to_virtio(_d);
>>> +
>>> +	virtio_reset_device(dev);
>>> +}
>>> +
>>>  static const struct bus_type virtio_bus = {
>>>  	.name = "virtio",
>>>  	.match = virtio_dev_match,
>>> @@ -403,6 +410,7 @@ static const struct bus_type virtio_bus = {
>>>  	.probe = virtio_dev_probe,
>>>  	.remove = virtio_dev_remove,
>>>  	.irq_get_affinity = virtio_irq_get_affinity,
>>> +	.shutdown = virtio_dev_shutdown,
>>>  };
>>>
>>>  int __register_virtio_driver(struct virtio_driver *driver, struct module *owner)
>>> --
>>> 2.43.0
>>>
>>>> gentle ping.
>>>>
>>>> This also fixes another issue with qemu vSMMU + virtio-scsi-pci. With
>>>> the above addition I get rid of spurious warnings in qemu on guest reboot:
>>>>
>>>> qemu-system-aarch64: virtio: zero sized buffers are not allowed
>>>> qemu-system-aarch64: vhost vring error in virtqueue 0: Invalid
>>>> argument (22)
>>>>
>>>> Would you mind if I respin?
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>>
>>>>> Thanks
>>>>>
>>>>> Eric
>>>>>>
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  static const struct bus_type virtio_bus = {
>>>>>>>>  	.name = "virtio",
>>>>>>>>  	.match = virtio_dev_match,
>>>>>>>> @@ -363,6 +372,7 @@ static const struct bus_type virtio_bus = {
>>>>>>>>  	.uevent = virtio_uevent,
>>>>>>>>  	.probe = virtio_dev_probe,
>>>>>>>>  	.remove = virtio_dev_remove,
>>>>>>>> +	.shutdown = virtio_dev_shutdown,
>>>>>>>>  };
>>>>>>>>
>>>>>>>>  int __register_virtio_driver(struct virtio_driver *driver,
>>>>>>>>  			     struct module *owner)