Hi Cédric,

I appreciate your insight. Please see my comments inline below.

On Mon, Nov 21, 2022 at 8:57 PM Cédric Le Goater <c...@kaod.org> wrote:
>
> On 11/21/22 12:57, Pingfan Liu wrote:
> > Sorry that forget a subject.
> >
> > On Mon, Nov 21, 2022 at 7:54 PM Pingfan Liu <kernelf...@gmail.com> wrote:
> >>
> >> Hello Powerpc folks,
> >>
> >> I encounter an kdump bug, which I bisect and pin commit 174db9e7f775
> >> ("powerpc/pseries/pci: Add support of MSI domains to PHB hotplug")
> >> In that case, using Fedora 36 as host, the mentioned commit as the
> >> guest kernel, and virto-block disk, the kdump kernel will hang:
>
> The host kernel should be using the PowerNV platform and not pseries
> or are you running a nested L2 guest on KVM/pseries L1 ?
>

The host kernel ran on P9 bare metal, and PowerKVM is used here.

> And as far as I remember, the patch above only impacts the IBM PowerVM
> hypervisor, not KVM, and PHB hotplug, or kdump induces some hot-plugging
> I am not aware of.
>

Sorry that my information was not clear. The suspect series is
"[PATCH 00/31] powerpc: Modernize the PCI/MSI support", which in
mainline begins with commit 786e5b102a00 ("powerpc/pseries/pci:
Introduce __find_pe_total_msi()"). I tried to bisect, but commit
a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains") hangs even the
first kernel. So I went ahead to the next functional change on pseries,
which is commit 174db9e7f775 ("powerpc/pseries/pci: Add support of MSI
domains to PHB hotplug").

> Also, if indeed, this is a L2 guest, the XIVE interrupt controller is
> emulated in QEMU, "info pic" should return:
>
>    ...
>    irqchip: emulated
>
> >>
> >> [ 0.000000] Kernel command line: elfcorehdr=0x22c00000
> >> no_timer_check net.ifnames=0 console=tty0 console=hvc0,115200n8
> >> irqpoll maxcpus=1 noirqdistrib reset_devices cgroup_disable=memory
> >> numa=off udev.children-max=2 ehea.use_mcs=0 panic=10
> >> kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd
> >> hugetlb_cma=0
> >> ...
> >> [ 7.763260] virtio_blk virtio2: 32/0/0 default/read/poll queues
> >> [ 7.771391] virtio_blk virtio2: [vda] 20971520 512-byte logical
> >> blocks (10.7 GB/10.0 GiB)
> >> [ 68.398234] systemd-udevd[187]: virtio2: Worker [190]
> >> processing SEQNUM=1193 is taking a long time
> >> [ 188.398258] systemd-udevd[187]: virtio2: Worker [190]
> >> processing SEQNUM=1193 killed
> >>
> >>
> >> During my test, I found that in very rare cases, the kdump can success
> >> (I guess it may be due to the cpu id). And if using either maxcpus=2
> >> or using scsi-disk, then kdump can also success. And before the
> >> mentioned commit, kdump can also success.
> >>
> >> The attachment contains the xml to reproduce that bug.
> >>
> >> Do you have any ideas?
>
> Most certainly an interrupt not being delivered. You can check the status
> on the host with :
>
>    virsh qemu-monitor-command --hmp <domain> "info pic"
>

OK, I will try to get hold of a P9 machine and give it a try. I will
update the info later.

Thanks,
Pingfan

>
>
> Thanks,
>
> C.