On Mon, 30 Mar 2020 at 14:17, BALATON Zoltan <bala...@eik.bme.hu> wrote: > > On Mon, 30 Mar 2020, Peter Maydell wrote: > > In dcr_write_pcie() we take the iothread lock around a call to > > pcie_host_mmcfg_udpate(). This is an incorrect attempt to deal with > > the bug fixed in commit 235352ee6e73d7716, where we were not taking > > the iothread lock before calling device dcr read/write functions. > > (It's not sufficient locking, because although the other cases in the > > switch statement won't assert, there is no locking which prevents > > multiple guest CPUs from trying to access the PPC460EXPCIEState > > struct at the same time and corrupting data.) > > Even though there's only a single CPU on sam460ex and PCIe is mostly > unused, with this patch I could no more reproduce a problem that we had > before with some programs crashing within guest under AmigaOS for unknown > reason. That problem happened randomly (although I could reproduce it > before) so I'm not sure if this fixed it or something else (more likely > commit 235352ee6e) or will just resurface later but at least this seems to > work so > > Tested-by: BALATON Zoltan <bala...@eik.bme.hu> > > Thanks for fixing it.
Thanks for the testing. I'm not sure why a single-cpu setup would have problems but I guess some device has a bottom-half or timer callback that will run in the iothread context, in which case it could race with the vcpu thread doing a dcr access. As you say, probably 235352ee6e rather than this change that's fixed it, assuming we really have fixed it. > > Unfortunately with commit 235352ee6e73d7716 we are now trying > > to recursively take the iothread lock, which will assert: > > > > $ qemu-system-ppc -M sam460ex --display none > > ** > > > > ERROR:/home/petmay01/linaro/qemu-from-laptop/qemu/cpus.c:1830:qemu_mutex_lock_iothread_impl: > > assertion failed: (!qemu_mutex_iothread_locked()) > > Aborted (core dumped) > > > > Remove the locking within dcr_write_pcie(). > > > > Fixes: 235352ee6e73d7716 > > Signed-off-by: Peter Maydell <peter.mayd...@linaro.org> > > --- > > I did a grep of hw/ppc and didn't see anything else that was doing > > its own locking inside a dcr read/write fn. > > I think we needed to add locking here because it asserted otherwise but I > don't remember the details now. Yeah, the memory-region adjustment done under the pcie_host_mmcfg_update() function will assert that the iothread lock is held. The locking just needs to be one level further up in the callstack. thanks -- PMM