On Wed, 20 Mar 2024 at 11:20, Philippe Mathieu-Daudé <phi...@linaro.org> wrote: > > On 20/2/24 16:19, Thomas Huth wrote: > > On 20/02/2024 16.08, Philippe Mathieu-Daudé wrote: > >> Have s390x always deliver NMI to the first CPU, > >> remove the @cpu_index argument from handler, > >> rename API as nmi_trigger() (not monitor specific). > > > > Could you please add some rationale here why this is needed / desired? > > I'm not sure it is desired... I'm trying to get the NMI delivery > working in heterogeneous machine, but now I'm wondering whether > hw/core/nmi.c was designed with that in mind or likely not. > > I suppose in a complex machine you explicitly wire IRQ lines such > NMI, so they are delivered to a particular INTC or CPU core, and > there is no "broadcast this signal to all listeners registered > for NMI events".
I think in a complex heterogenous machine you do want the monitor NMI command to do something sensible, but the definition of "sensible" is going to be machine-specific: probably it will be "raise NMI in some way on some core in the main application processor cluster", and it's the machine model that's going to know what "sensible" is for that machine. The current hw/core/nmi.c code is a bit odd because it's partly working with a cpu_index and partly not: the code passes cpu_index around, but in practice for the QMP command the user can't set which CPU to operate on, and for everything except s390 the implementation doesn't care anyway. My impression from the IRC discussion is that it's not really necessary for the S390 that the monitor user be able to specify which CPU to NMI (and in any case you can only do that from the HMP command, not the QMP command, AIUI), so getting rid of that weird inconsistency makes sense to me: and that's what this patchset is doing. What NMI probably ought to be is board-specific: so it's like having some notional front panel switch labeled "NMI", and the board gets to decide what that means (which is usually going to be "send some NMI like interrupt to the first CPU in the main cluster", but could be something else). It doesn't need to be like a front panel switch with a rotary-selector for 'pick a CPU' plus a button for "send NMI to that CPU". In fact we're quite close to "it's a board thing" already, because almost every implementation of the TYPE_NMI interface is actually a machine model. (The exceptions are hw/intc/m68k_irqc.c, hw/m68k/q800-glue.c and hw/misc/macio/gpio.c.) So I think that: * we should indeed drop the cpu_index stuff, per this patch: it's unnecessary cruft we don't really use * we should look at whether the three classes listed above which implement TYPE_NMI on a non-machine-model are really the right way to do that, i.e. whether it would be a lot of effort to effectively switch to having nmi_monitor_handler be a simple method on MachineClass. Not walking the QOM tree would make the NMI infrastructure rather simpler. (But I just looked at the macio case, and it's inside a PCI device, so at best that's a bunch of clunky plumbing.) * failing that, we should look at whether we should really continue to walk the whole QOM tree calling methods on every TYPE_NMI object, or whether we can say "once we've found one implementation we're done". This also depends on those three non-MachineClass implementations, because obviously there's only ever one MachineClass object in the system. This is kind of useful for heterogenous boards which use the m68k or ppc devices listed above (seems highly unlikely), but it would mean you can override the default "those objects handle NMI" by having your heterogenous board implement TYPE_NMI, and then since it's earlier in the QOM tree that will be the method called, not the ones on specific devices. (This one I think we can easily do -- my quick check suggests that TYPE_M68K_IRQ is only used in the m68k virt board, TYPE_GLUE is only used in the m68k q800 board, and TYPE_MACIO_GPIO is only used in the ppc mac99 board. So in fact in all cases there's only ever one TYPE_NMI interface present in the system.) The last two aren't blockers for heterogenous-system work, though: they just seem to me like nice cleanup of this interface. thanks -- PMM