On Wed, Oct 02, 2019 at 04:47:56PM +0200, Cédric Le Goater wrote: > On 02/10/2019 16:21, Greg Kurz wrote: > > On Wed, 2 Oct 2019 11:02:45 +1000 > > David Gibson <da...@gibson.dropbear.id.au> wrote: > > > >> On Tue, Oct 01, 2019 at 06:56:28PM +0200, Greg Kurz wrote: > >>> On Tue, 1 Oct 2019 13:56:10 +0200 > >>> Cédric Le Goater <c...@kaod.org> wrote: > >>> > >>>> On 01/10/2019 13:06, Greg Kurz wrote: > >>>>> On Tue, 1 Oct 2019 10:57:22 +0200 > >>>>> Cédric Le Goater <c...@kaod.org> wrote: > >>>>> > >>>>>> When vCPUs are hotplugged, they are added to the QEMU CPU list before > >>>>>> being fully realized. This can crash the XIVE presenter because the > >>>>>> 'tctx' pointer is not necessarily initialized when looking for a > >>>>>> matching target. > >>>>>> > >>>>> > >>>>> Ouch... :-\ > >>>>> > >>>>>> These vCPUs are not valid targets for the presenter. Skip them. > >>>>>> > >>>>> > >>>>> This likely fixes this specific issue, but more generally, this > >>>>> seems to indicate that using CPU_FOREACH() is rather fragile. > >>>>> > >>>>> What about tracking XIVE TM contexts with a QLIST in xive.c ? > >>>> > >>>> This is a good idea. > >>>> > >>>> On HW, the thread interrupt contexts belong to the XIVE presenter > >>>> subengine. This is the logic doing the CAM line matching to find > >>>> a target for an event notification. So we should model the context > >>>> list below the XiveRouter in QEMU which models both router and > >>>> presenter subengines. We have done without a presenter model for > >>>> the moment and I don't think we will need to introduce one. > >>>> > >>>> This would be a nice improvements of my patchset adding support > >>>> for xive escalations and better support of multi chip systems. > >>>> I have introduced a PNV_CHIP_CPU_FOREACH in this patchset which > >>>> would become useless with a list of tctx under the XIVE interrupt > >>>> controller, XiveRouter, SpaprXive, PnvXive. > >>>> > >>> > >>> I agree. It makes more sense to have the list below the XiveRouter, > >>> rather than relying on CPU_FOREACH(), which looks a bit weird from > >>> a device emulation code perspective. > >> > >> That does sound like a better idea long term. However, for now, I > >> think the NULL check is a reasonable way of fixing the real error > >> we're hitting, so I've applied the patch here. > >> > > > > Fair enough. > > > > Reviewed-by: Greg Kurz <gr...@kaod.org> > > > >> Future cleanups to a different approach remain welcome, of course. > >> > > > > I've started to look. This should simplify Cedric's "add XIVE support > > for KVM guests" series, but I'll wait for your massive cleanup series > > to get merged first. > > > This is a combo patchset. > > > These are prereqs fixes related to the presenter and how CAM lines > are scanned. This is in direct relation with the CPU_FOREACH() issue. > > ppc/xive: Introduce a XivePresenter interface > ppc/xive: Implement the XivePresenter interface > ppc/pnv: Introduce a PNV_CHIP_CPU_FOREACH() helper > ppc/pnv: Introduce a pnv_xive_is_cpu_enabled() helper > ppc/xive: Introduce a XiveFabric interface > ppc/pnv: Implement the XiveFabric interface > ppc/spapr: Implement the XiveFabric interface > ppc/xive: Use the XiveFabric and XivePresenter interfaces > > These are for interrupt resend (escalation). To be noted, the removal > of the get_tctx() XiveRouter handler which has some relation with > the previous subserie. > > ppc/xive: Extend the TIMA operation with a XivePresenter parameter > ppc/pnv: Clarify how the TIMA is accessed on a multichip system > ppc/xive: Move the TIMA operations to the controller model > ppc/xive: Remove the get_tctx() XiveRouter handler > ppc/xive: Introduce a xive_tctx_ipb_update() helper > ppc/xive: Introduce helpers for the NVT id > ppc/xive: Synthesize interrupt from the saved IPB in the NVT > > This is a bug : > > ppc/pnv: Remove pnv_xive_vst_size() routine > ppc/pnv: Dump the XIVE NVT table > ppc/pnv: Skip empty slots of the XIVE NVT table > > This is a model for the block id configuration and better support > for multichip systems : > > ppc/pnv: Introduce a pnv_xive_block_id() helper > ppc/pnv: Extend XiveRouter with a get_block_id() handler > > Misc improvements : > > ppc/pnv: Quiesce some XIVE errors > ppc/xive: Introduce a xive_os_cam_decode() helper > ppc/xive: Check V bit in TM_PULL_POOL_CTX > ppc/pnv: Improve trigger data definition > ppc/pnv: Use the EAS trigger bit when triggering an interrupt from PSI > > > I should move some in front to have them merged before hand if > possible. They have been on the list for 3 months.
Sorry :(. I've been kind of overwhelmed. -- David Gibson | I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson
signature.asc
Description: PGP signature