On 05/21/2014 08:35 PM, Alexander Graf wrote:
>
> On 21.05.14 12:13, Alexey Kardashevskiy wrote:
>> On 05/21/2014 07:50 PM, Alexander Graf wrote:
>>> On 21.05.14 11:33, Alexey Kardashevskiy wrote:
>>>> On 05/21/2014 07:13 PM, Alexander Graf wrote:
>>>>> On 21.05.14 11:11, Michael S. Tsirkin wrote:
>>>>>> On Wed, May 21, 2014 at 11:06:09AM +0200, Alexander Graf wrote:
>>>>>>> On 21.05.14 10:52, Alexey Kardashevskiy wrote:
>>>>>>>> On 05/21/2014 06:40 PM, Alexander Graf wrote:
>>>>>>>>> On 15.05.14 11:59, Alexey Kardashevskiy wrote:
>>>>>>>>>> Currently SPAPR PHB keeps track of all allocated MSI/MSIX
>>>>>>>>>> interrupts as XICS used to be unable to reuse interrupts, which
>>>>>>>>>> becomes a problem for dynamic MSI reconfiguration which is
>>>>>>>>>> happening on guest driver reload or PCI hot (un)plug. Another
>>>>>>>>>> problem is that PHB has a limit of devices supporting MSI/MSIX
>>>>>>>>>> (SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
>>>>>>>>>>
>>>>>>>>>> This makes use of new XICS ability to reuse interrupts.
>>>>>>>>>>
>>>>>>>>>> This removes cached MSI configuration from SPAPR PHB so the first
>>>>>>>>>> IRQ number of a device is stored in MSI/MSIX config space so there
>>>>>>>>>> is no need to store this anywhere else. From now on, SPAPR PHB only
>>>>>>>>>> keeps flags telling what type of interrupt for which device it has
>>>>>>>>>> configured in order to return error if (for example) MSIX was
>>>>>>>>>> enabled and the guest is trying to disable MSI which it has not
>>>>>>>>>> enabled.
>>>>>>>>>>
>>>>>>>>>> This removes a limit for the maximum number of MSIX-enabled devices
>>>>>>>>>> per PHB, now XICS and PCI bus capacity are the only limitation.
>>>>>>>>>>
>>>>>>>>>> This changes migration stream as it fixes
>>>>>>>>>> vmstate_spapr_pci_msi::name which was wrong since the beginning.
>>>>>>>>>>
>>>>>>>>>> This fixes traces to be more informative.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
>>>>>>>>>> ---
>>>>>>>>>>
>>>>>>>>>> In reality either MSIX or MSI is enabled, never both. So I could
>>>>>>>>>> remove msi/msix bitmaps from this patch, would it make sense?
>>>>>>>>> Is this a hard requirement? Does a device have to choose between
>>>>>>>>> MSIX and MSI or could it theoretically have both enabled? Is this a
>>>>>>>>> PCI limitation, a PAPR/XICS limitation or just a limitation of your
>>>>>>>>> implementation?
>>>>>>>> My implementation does not have this limitation, I asked if I can
>>>>>>>> simplify code by introducing one :)
>>>>>>>>
>>>>>>>> I cannot see any reason why PCI cannot have both MSI and MSIX enabled
>>>>>>>> but it does not seem to be used by anyone => cannot debug and confirm.
>>>>>>>>
>>>>>>>> PAPR spec assumes that if the guest tries enabling MSIX when MSI is
>>>>>>>> already enabled, this is a "change", not enabling both types. But it
>>>>>>>> also says MSI and MSIX vector numbers are not shared. Hm.
>>>>>>> Yeah, I'm not aware of any limitation on hardware here and I'd
>>>>>>> rather not impose one.
>>>>>>>
>>>>>>> Michael, do you know of any hardware that uses MSI *and* MSI-X at
>>>>>>> the same time?
>>>>>>>
>>>>>>>
>>>>>>> Alex
>>>>>> No, and the PCI spec says:
>>>>>> A function is permitted to implement both MSI and MSI-X, but
>>>>>> system software is prohibited from enabling both at the same time.
>>>>>> If system software enables both at the same time, the result is
>>>>>> undefined.
>>>>> Ah, cool. So yes Alexey, feel free to impose it :).
>>>> Heh.
>>>> This solves just half of the problem - I still have to keep track of
>>>> what device got MSI/MSIX configured via that ibm,change-msi interface. I
>>>> was hoping I can store such flag somewhere in a device PCI config space
>>>> but MSI/MSIX enable bit is not good as it is not set when those calls
>>>> are made. And I cannot rely on address/data fields much as the guest can
>>>> change them (I already use them to store IRQ numbers and btw it is
>>>> missing checks when I read them back for disposal, I'll fix in next
>>>> round).
>>>>
>>>> Or on "enable" event I could put IRQ numbers to .data of MSI config space
>>>> and on "disable" check if it is not zero, then configuration took place,
>>>> then I can remove my msi[]/msix[] flag arrays. If the guest did any change
>>>> to MSI/MSIX config space (it does not on SPAPR except weird selftest
>>>> cases), I compare .data with what ICS can possibly have and either reject
>>>> "disable" or handle it and if it breaks XICS - that's too bad for the
>>>> stupid guest. Would that be acceptable?
>>> Can't you prohibit the guest from writing to the MSI configuration
>>> registers itself? Then you don't need to do sanity checks.
>>
>> I could for emulated devices but VFIO uses the same code. For example,
>> there is an IBM SCSI IPR card which does a "self test". For that, it saves
>> MSIX BAR content, does reboot via some backdoor interface and restores MSIX
>> BAR. It has been solved for VFIO in the host kernel by restoring MSIX data
>> from cached values when guest is trying to restore it with what it thinks
>> is actual MSIX data (it is virtualized because of x86). But there is cache
>
> We already have a cache because we don't access the real PCI registers with
> msi_set_message(), no?
For emulated devices there is no cache. And in any case the guest is
allowed to write to it... Who knows what AIX does? I do not.


--
Alexey
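
For readers following the thread, the "store the first IRQ in .data on
enable, sanity-check it on disable" idea discussed above can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the code from the patch: FakeIcs, FakePciDev, ICS_FIRST_IRQ,
ICS_NUM_IRQS, msi_enable() and msi_disable() are all hypothetical names
invented here, and the model ignores real config space layout, MSIX
tables and the ibm,change-msi RTAS plumbing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; none of these names are QEMU's. */
#define ICS_FIRST_IRQ 0x1000u  /* assumed first IRQ number of the source */
#define ICS_NUM_IRQS  64u      /* assumed number of IRQs it provides */

typedef struct { bool used[ICS_NUM_IRQS]; } FakeIcs;
typedef struct { uint32_t msi_data; } FakePciDev; /* models the MSI .data field */

/* True if [first, first + num) lies entirely inside the ICS range. */
static bool ics_range_valid(uint32_t first, unsigned num)
{
    return num > 0 && num <= ICS_NUM_IRQS &&
           first >= ICS_FIRST_IRQ &&
           first - ICS_FIRST_IRQ <= ICS_NUM_IRQS - num;
}

/* Allocate num contiguous free IRQs; returns the first one, or 0 if none. */
static uint32_t ics_alloc(FakeIcs *ics, unsigned num)
{
    if (num == 0 || num > ICS_NUM_IRQS) {
        return 0;
    }
    for (unsigned start = 0; start + num <= ICS_NUM_IRQS; start++) {
        unsigned i = 0;
        while (i < num && !ics->used[start + i]) {
            i++;
        }
        if (i == num) {
            for (i = 0; i < num; i++) {
                ics->used[start + i] = true;
            }
            return ICS_FIRST_IRQ + start;
        }
    }
    return 0;
}

/* True if every IRQ in the range is currently allocated in the ICS. */
static bool ics_owns(const FakeIcs *ics, uint32_t first, unsigned num)
{
    if (!ics_range_valid(first, num)) {
        return false;
    }
    for (unsigned i = 0; i < num; i++) {
        if (!ics->used[first - ICS_FIRST_IRQ + i]) {
            return false;
        }
    }
    return true;
}

/* "enable": allocate IRQs and remember the first one in .data only. */
int msi_enable(FakeIcs *ics, FakePciDev *dev, unsigned num)
{
    if (dev->msi_data != 0) {
        return -1;              /* something is already configured */
    }
    uint32_t first = ics_alloc(ics, num);
    if (first == 0) {
        return -1;              /* no room left in the ICS */
    }
    dev->msi_data = first;
    return 0;
}

/* "disable": trust .data only if it matches what the ICS could hold. */
int msi_disable(FakeIcs *ics, FakePciDev *dev, unsigned num)
{
    uint32_t first = dev->msi_data;
    if (first == 0) {
        return -1;              /* zero means nothing was configured */
    }
    if (!ics_owns(ics, first, num)) {
        return -1;              /* guest changed .data; reject the call */
    }
    for (unsigned i = 0; i < num; i++) {
        ics->used[first - ICS_FIRST_IRQ + i] = false;
    }
    dev->msi_data = 0;          /* freed IRQs can be reused later */
    return 0;
}
```

In this sketch a "disable" after the guest has overwritten .data is
simply rejected, which is one of the two options floated in the thread;
it does nothing about the VFIO self-test case, where the guest
legitimately rewrites the MSIX data.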