On Mon, Jul 27, 2020 at 08:44:09PM +0800, Jason Wang wrote:
>
> On 2020/7/27 7:43 PM, Michael S. Tsirkin wrote:
> > On Mon, Jul 27, 2020 at 04:51:23PM +0800, Jason Wang wrote:
> > > On 2020/7/27 4:41 PM, Cornelia Huck wrote:
> > > > On Mon, 27 Jul 2020 15:38:12 +0800
> > > > Jason Wang<jasow...@redhat.com> wrote:
> > > >
> > > > > On 2020/7/27 2:43 PM, Cornelia Huck wrote:
> > > > > > On Sat, 25 Jul 2020 08:40:07 +0800
> > > > > > Jason Wang<jasow...@redhat.com> wrote:
> > > > > > > On 2020/7/24 11:34 PM, Cornelia Huck wrote:
> > > > > > > > On Fri, 24 Jul 2020 11:17:57 -0400
> > > > > > > > "Michael S. Tsirkin"<m...@redhat.com> wrote:
> > > > > > > > > On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote:
> > > > > > > > > > On Fri, 24 Jul 2020 09:30:58 -0400
> > > > > > > > > > "Michael S. Tsirkin"<m...@redhat.com> wrote:
> > > > > > > > > > > On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote:
> > > > > > > > > > > > When I start qemu with a second virtio-net-ccw device (i.e. adding
> > > > > > > > > > > > -device virtio-net-ccw in addition to the autogenerated device), I get
> > > > > > > > > > > > a segfault. gdb points to
> > > > > > > > > > > >
> > > > > > > > > > > > #0 0x000055d6ab52681d in virtio_net_get_config (vdev=<optimized out>,
> > > > > > > > > > > >     config=0x55d6ad9e3f80 "RT") at /home/cohuck/git/qemu/hw/net/virtio-net.c:146
> > > > > > > > > > > > 146 if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
> > > > > > > > > > > >
> > > > > > > > > > > > (backtrace doesn't go further)
> > > > > > > > > > The core was incomplete, but running under gdb directly shows that it
> > > > > > > > > > is just a bog-standard config space access (first for that device).
> > > > > > > > > >
> > > > > > > > > > The cause of the crash is that nc->peer is not set... no idea how that
> > > > > > > > > > can happen, not that familiar with that part of QEMU. (Should the code
> > > > > > > > > > check, or is that really something that should not happen?)
> > > > > > > > > >
> > > > > > > > > > What I don't understand is why it is set correctly for the first,
> > > > > > > > > > autogenerated virtio-net-ccw device, but not for the second one, and
> > > > > > > > > > why virtio-net-pci doesn't show these problems. The only difference
> > > > > > > > > > between -ccw and -pci that comes to my mind here is that config space
> > > > > > > > > > accesses for ccw are done via an asynchronous operation, so timing
> > > > > > > > > > might be different.
> > > > > > > > > Hopefully Jason has an idea. Could you post a full command line
> > > > > > > > > please? Do you need a working guest to trigger this? Does this trigger
> > > > > > > > > on an x86 host?
> > > > > > > > Yes, it does trigger with tcg-on-x86 as well. I've been using
> > > > > > > >
> > > > > > > > s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg
> > > > > > > > -cpu qemu,zpci=on
> > > > > > > > -m 1024 -nographic -device
> > > > > > > > virtio-scsi-ccw,id=scsi0,devno=fe.0.0001
> > > > > > > > -drive
> > > > > > > > file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0
> > > > > > > > -device
> > > > > > > > scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1
> > > > > > > > -device virtio-net-ccw
> > > > > > > >
> > > > > > > > It seems it needs the guest actually doing something with the nics; I
> > > > > > > > cannot reproduce the crash if I use the old advent calendar moon buggy
> > > > > > > > image and just add a virtio-net-ccw device.
> > > > > > > >
> > > > > > > > (I don't think it's a problem with my local build, as I see the problem
> > > > > > > > both on my laptop and on an LPAR.)
> > > > > > > It looks to me like we forgot to check the existence of the peer.
> > > > > > >
> > > > > > > Please try the attached patch to see if it works.
> > > > > > Thanks, that patch gets my guest up and running again. So, FWIW,
> > > > > >
> > > > > > Tested-by: Cornelia Huck<coh...@redhat.com>
> > > > > >
> > > > > > Any idea why this did not hit with virtio-net-pci (or the autogenerated
> > > > > > virtio-net-ccw device)?
> > > > > It can be hit with virtio-net-pci as well (just start without peer).
> > > > Hm, I had not been able to reproduce the crash with a 'naked' -device
> > > > virtio-net-pci. But checking seems to be the right idea anyway.
> > > Sorry for being unclear, I meant that for the networking part, you just need
> > > to start without a peer, and you need a real guest (any Linux) that is trying
> > > to access the config space of virtio-net.
> > >
> > > Thanks
> > A pxe guest will do it, but that doesn't support ccw, right?
>
>
> Yes, it depends on the cli actually.
>
> >
> >
> > I'm still unclear why this triggers with ccw but not pci -
> > any idea?
>
>
> I didn't test pxe but I can reproduce this with pci (just start a linux guest
> without a peer).
>
> Thanks
>

Might be a good addition to a unit test. Not sure what the test would do
exactly: just make sure the guest runs? Looks like a lot of work for an
empty test ... maybe we can poke at the guest config with qtest commands
at least.

-- 
MST
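
For reference, the kind of check discussed in this thread ("check the existence of the peer") can be sketched roughly as below. This is not the attached patch, which is not reproduced here. The struct and enum definitions are simplified, hypothetical stand-ins for QEMU's types, and peer_is_vhost_vdpa() is a made-up helper name. Only the nc->peer->info->type expression and NET_CLIENT_DRIVER_VHOST_VDPA are taken from the gdb output quoted in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-ins for QEMU's net client types.
 * Only the fields that appear in the backtrace are modelled.
 */
typedef enum {
    NET_CLIENT_DRIVER_NONE,
    NET_CLIENT_DRIVER_TAP,
    NET_CLIENT_DRIVER_VHOST_VDPA,
} NetClientDriver;

typedef struct NetClientInfo {
    NetClientDriver type;
} NetClientInfo;

typedef struct NetClientState {
    const NetClientInfo *info;
    struct NetClientState *peer; /* NULL when the NIC has no backend */
} NetClientState;

/*
 * Guarded form of the condition from virtio-net.c:146. A NIC created
 * with a bare "-device virtio-net-ccw" has no peer, so nc->peer must
 * be tested before it is dereferenced.
 */
static bool peer_is_vhost_vdpa(const NetClientState *nc)
{
    assert(nc != NULL); /* the NIC's own state is assumed to exist */

    return nc->peer != NULL
        && nc->peer->info != NULL
        && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA;
}
```

With a helper like this, a config space read on a peerless device would take the non-vDPA path instead of dereferencing a NULL pointer.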