Jan, Alex, ping?
On 12/07/12 15:47, Alexey Kardashevskiy wrote: > On 12/07/12 15:29, Alex Williamson wrote: >> On Thu, 2012-07-12 at 14:58 +1000, Alexey Kardashevskiy wrote: >>> On 12/07/12 14:43, Alex Williamson wrote: >>>> On Thu, 2012-07-12 at 14:38 +1000, Alexey Kardashevskiy wrote: >>>>> On 12/07/12 14:31, Alex Williamson wrote: >>>>>> On Thu, 2012-07-12 at 14:16 +1000, Alexey Kardashevskiy wrote: >>>>>>> On 12/07/12 12:54, Alex Williamson wrote: >>>>>>>> On Wed, 2012-07-11 at 12:25 +1000, Alexey Kardashevskiy wrote: >>>>>>>>> On 11/07/12 02:57, Alex Williamson wrote: >>>>>>>>>> On Tue, 2012-07-10 at 15:51 +1000, Alexey Kardashevskiy wrote: >>>>>>>>>>> The two patches in this set are supposed to add VFIO support for >>>>>>>>>>> POWER. >>>>>>>>>>> >>>>>>>>>>> The first one adds one more step in the initalizaion sequence which >>>>>>>>>>> I am not >>>>>>>>>>> sure is correct. >>>>>>>>>>> >>>>>>>>>>> The second patch adds actual VFIO support. It is not ready to >>>>>>>>>>> submit but >>>>>>>>>>> ready to discuss. I would like to get rid of all #ifdef >>>>>>>>>>> TARGET_PPC64 in patch #2 >>>>>>>>>>> and I wonder if there is any plan to implement some generic EOI >>>>>>>>>>> support code, etc. >>>>>>>>>> >>>>>>>>>> A generic EOI notifier is on my todo list, but I have no idea what >>>>>>>>>> it's >>>>>>>>>> going to look like. As you know, I've got an ioapic specific >>>>>>>>>> notifier >>>>>>>>>> in my tree, you add a spapr specific one. I welcome ideas on how to >>>>>>>>>> create something generic that has a chance of being accepted. >>>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> >>>>>>>>> So far the only platform specific call is xxxx_add_gsi_eoi_notifier. >>>>>>>>> The >>>>>>>>> xxxx_remove_gsi_eoi_notifier only calls notifier_remove, you've got >>>>>>>>> to fix yours >>>>>>>>> ioapic_remove_gsi_eoi_notifier() as it does too much :) >>>>>>>>> >>>>>>>>> >>>>>>>>> The only place for placing "add_eoi" callback I can see right now is >>>>>>>>> QEMUMachine as there is no >>>>>>>>> unified machine interrupt controller - IOAPIC has its own type >>>>>>>>> TYPE_IOAPIC_COMMON and XICS is not >>>>>>>>> even a SysBusDevice. And the callback is not specific for any kind of >>>>>>>>> bus so it cannot go to PCIBus. >>>>>>>>> >>>>>>>>> Does it sound reasonable? >>>>>>>> >>>>>>>> I suspect we'd need to somehow tie it into qemu_irq where both handlers >>>>>>>> and notifiers are allocated so we don't really care the underlying >>>>>>>> implementation. Something like qemu_add_irq_eoi_notifier(qemu_irq >>>>>>>> irq, ...). It's another mess like adding the PCIBus interrupt line to >>>>>>>> gsi effort though. Thanks, >>>>>>> >>>>>>> >>>>>>> Tried. Added add_eoi_notifier() callback to qemu_irq, new IRQ allocator: >>>>>>> qemu_irq *qemu_allocate_irqs2(qemu_irq_handler handler, void *opaque, >>>>>>> int n, >>>>>>> qemu_eoi_add_notifier add_notifier); >>>>>>> and called it from the XICS initialization code. >>>>>>> >>>>>>> It could work out if pci_get_irq() or pci_route_irq_fn() returned >>>>>>> qemu_irq but no, they just return >>>>>>> a global IRQ number (pure or embedded in a struct) and there is no >>>>>>> common way to resolve qemu_irq >>>>>>> (and then add_eoi_notifier()) from that number within vfio_pci. >>>>>> >>>>>> Well GSI and qemu_irq are different address spaces. We still need GSI >>>>>> for any kind of qemu bypass case. >>>>> >>>>> No, that is ok, we also need GSI because XICS and IOAPIC need it in the >>>>> end. >>>>> >>>>>>> May be we could add the callback pointer into PCIINTxRoute? >>>>>> >>>>>> Maybe, but why is this PCI specific? Can't we call it as >>>>>> qemu_add_irq_eoi_notifier(pdev->irq[0], Notifier)? That would work much >>>>>> like qemu_set_irq, extracting the irq number from the IRQState and >>>>>> passing it through to the add_notifier callback for IRQState until it >>>>>> got to the ioapic/pic/xics. >>>>>> >>>>>> int qemu_add_irq_eoi_notifier(qemu_irq *irq, Notifier notifier) >>>>>> { >>>>>> if (!irq || !irq->add_eoi_notifier) >>>>>> return -1; >>>>>> >>>>>> return irq->add_eoi_notifier(irq->opaque, irq->n, notifier); >>>>>> } >>>>>> >>>>> >>>>> Then we will have to entirely replace qemu_allocate_irqs() with >>>>> qemu_allocate_irqs2() and pass some >>>>> non-zero add_eoi_notifier() on every level, at least for PCI for now. I >>>>> would like to avoid that if >>>>> possible - hard to get accepted :) >>>> >>>> Yep, that's why I said it was the same kind of mess as the PCIBus intx >>>> routing. It's intrusive, but qemu_irq is the common interrupt model so >>>> we need to make use of it. >>> >>> There are 2 level of intrusion. >>> >>> 1. Fix PCIINTxRoute to return the GSI's qemu_irq as well. >> >> Slightly confusing because pdev->irq[] is a qemu_irq, but you want the >> actual ioapic/pic/xics qemu_irq w/o walking through the various devices, >> correct? > > Yes. The qemu_irq which corresponds to the GSI which pci_get_irq is returning. > >> I'm not sure what we do once we have it though. Do we get to >> call something like the function outlined above on these "special" >> qemu_irqs? > > They are not special but just "global". This is what hw/pc_piix.c allocates > with qemu_allocate_irqs(). > > Assuming we have properly initialized add_eoi_notifier() callback in the > qemu_irq struct, we can > easily add a notifier via this callback. > > Or I did not get the whole idea. > >> >>> 2. Add add_eoi_notifier to all levels including PCI. As a part of this, we >>> will have to add this >>> callback to all pci_register_bus() calls to reach global interrupts via >>> platform-specific PCI bus. >> >> Just like the PCI INTx route callback, most of these can just be >> passthrough. We just need to get to the end qemu_irq that registered a >> real add notifier. That might make it possible to do it w/o interfering >> too much with other callers, I hope. > > Yes. This is why I propose to extend the PCIINTxRoute struct. > > Actually even adding a callback into QEMUMachine is not that bad idea. > > If a pointer to the struct QEMUMachine was passed into QEMUMachineInitFunc(), > it would be the right > place to init such callback, one per machine but not per every qemu_irq as it > is the same for the > whole machine and will not change. > > >>> I would stay with 1). Is that bad? >> >> It still seems to present a rather large incongruity, but if we're >> planning to cache the qemu_irq there anyway, maybe it's a secondary use. > > Cannot see how it is different from having pci_get_irq() or > pci_route_irq_fn() though. > > -- Alexey