On Sun, Jun 06, 2010 at 12:10:07PM +0200, Jan Kiszka wrote:
> Gleb Natapov wrote:
> > On Sun, Jun 06, 2010 at 10:07:48AM +0200, Jan Kiszka wrote:
> >> Gleb Natapov wrote:
> >>> On Sun, Jun 06, 2010 at 09:39:04AM +0200, Jan Kiszka wrote:
> >>>> Gleb Natapov wrote:
> >>>>> On Sat, Jun 05, 2010 at 02:04:01AM +0200, Jan Kiszka wrote:
> >>>>>>> I'd like to also support EOI handling. When the guest clears the
> >>>>>>> interrupt condition, the EOI callback would be called. This could occur
> >>>>>>> much later than the IRQ delivery time. I'm not sure if we need the
> >>>>>>> result code in that case.
> >>>>>>>
> >>>>>>> If any intermediate device (IOAPIC?) needs to be informed about either
> >>>>>>> delivery or EOI also, it could create a proxy message with its
> >>>>>>> callbacks in place. But we need then a separate opaque field (in
> >>>>>>> addition to payload) to store the original message.
> >>>>>>>
> >>>>>>> struct IRQMsg {
> >>>>>>>  DeviceState *src;
> >>>>>>>  void (*delivery_cb)(IRQMsg *msg, int result);
> >>>>>>>  void (*eoi_cb)(IRQMsg *msg, int result);
> >>>>>>>  void *src_opaque;
> >>>>>>>  void *payload;
> >>>>>>> };
> >>>>>> Extending the lifetime of IRQMsg objects beyond the delivery call stack
> >>>>>> means qemu_malloc/free for every delivery. I think it takes a _very_
> >>>>>> appealing reason to justify this. But so far I do not see any use case
> >>>>>> for eoi_cb at all.
> >>>>>>
> >>>>> I dislike use of eoi for reinjecting missed interrupts since
> >>>>> it eliminates use of internal PIC/APIC queue of not yet delivered
> >>>>> interrupts. PIC and APIC have an internal queue that can handle two
> >>>>> elements: one is a delivered but not yet acked interrupt in isr and
> >>>>> another is a pending interrupt in irr. Using eoi callback (or ack
> >>>>> notifier as it's called inside kernel) an interrupt will be considered
> >>>>> coalesced even if irr is cleared, but no ack was received for the
> >>>>> previously delivered interrupt.
> >>>>> But ack notifiers actually have another use: device assignment. There is
> >>>>> a plan to move device assignment from kernel to userspace and for that
> >>>>> ack notifiers will have to be extended to userspace too. If so we can
> >>>>> use them to do irq decoalescing as well. I doubt they should be part
> >>>>> of IRQMsg though. Why not do what kernel does: have globally registered
> >>>>> notifier based on irqchip/pin.
> >>>> I read this twice but I still don't get your plan. Do you like or
> >>>> dislike using EOI for de-coalescing? And how should these notifiers work?
> >>>>
> >>> That's because I confused myself :) I _dislike_ them to be used, but
> >>> since device assignment requires ack notifiers anyway maybe it is better
> >>> to introduce one mechanism for device assignment + de-coalescing instead
> >>> of introducing two different mechanisms. Using ack notifiers should be
> >>> easy: RTC registers ack notifier and keeps track of delivered interrupts.
> >>> If timer triggers after previous irq was set, but before it was acked,
> >>> coalesced counter is incremented. In ack notifier callback coalesced
> >>> counter is checked and if it is not zero new irq is set.
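
To make the ack-notifier scheme quoted just above concrete, here is a
minimal sketch of the bookkeeping. It is only an illustration: RTCSketch,
rtc_timer_tick(), rtc_ack_notify() and the demo function are hypothetical
names, not existing QEMU or KVM interfaces, and registration/routing of the
notifier is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state; none of these names exist in QEMU or KVM. */
typedef struct RTCSketch {
    bool irq_in_flight;  /* irq was set, guest has not acked it yet */
    unsigned coalesced;  /* ticks that fired while an irq was in flight */
    unsigned injected;   /* irqs actually set (kept only for the demo) */
} RTCSketch;

static void rtc_set_irq(RTCSketch *s)
{
    s->irq_in_flight = true;
    s->injected++;
}

/* Periodic timer fired: set the irq, or count the tick as coalesced. */
static void rtc_timer_tick(RTCSketch *s)
{
    if (s->irq_in_flight) {
        s->coalesced++;
    } else {
        rtc_set_irq(s);
    }
}

/* Ack notifier callback: replay one coalesced tick, if there is any. */
static void rtc_ack_notify(RTCSketch *s)
{
    s->irq_in_flight = false;
    if (s->coalesced > 0) {
        s->coalesced--;
        rtc_set_irq(s);
    }
}

/* Three ticks arrive before the first ack; all three end up injected. */
static unsigned rtc_sketch_demo(void)
{
    RTCSketch s = { false, 0, 0 };

    rtc_timer_tick(&s);  /* injected immediately */
    rtc_timer_tick(&s);  /* coalesced */
    rtc_timer_tick(&s);  /* coalesced */
    rtc_ack_notify(&s);  /* replays first coalesced tick */
    rtc_ack_notify(&s);  /* replays second coalesced tick */
    rtc_ack_notify(&s);  /* nothing left to replay */
    assert(s.coalesced == 0);
    return s.injected;
}
```

Note that this sketch replays at most one tick per ack, so a long masked
period still turns into a burst of back-to-back interrupts afterwards.
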
> >> Ack notifier registrations and event deliveries still need to be routed.
> >> Piggy-backing this on IRQ messages may be unavoidable for that reason.
> > It is done in the kernel without piggy-backing.
> 
> As it does not include any IRQ routers in front of the interrupt
> controller. Maybe it works for x86, but it is no generic solution.
> 
x86 has IRQ router in front of interrupt controller inside pci host
bridge.

> Also, periodic timer sources get no information about the fact that
> their interrupt is masked somewhere along the path to the VCPUs and will
> possibly replay countless IRQs when the masking ends, no?
> 
Correct, for that we have mask notifiers in the kernel. Gets uglier by the
minute.

> > 
> >> Anyway, I'm going to post my HPET updates with the infrastructure for
> >> IRQMsg now. Maybe it's helpful to see the other option in reality.
> >>
> > One other thing to consider: current approach does not always work.
> > Win2K3-64bit-smp and Win2k8-64bit-smp configure RTC interrupt to be
> > broadcasted to all cpus, but only boot cpu does time calculation. With
> > current approach if interrupt is delivered to at least one vcpu
> > it will not be considered coalesced, but if cpu it was delivered to is
> > not cpu that does time accounting then clock will drift.
> 
> That means we would have to fire callbacks per receiving CPU and report
> its number back. Is there a way to find out if we are running such a
> guest without an '-enable-win2k[38]-64bit-smp-rtc-drift-fix'?
> 
Not that I know of.
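
For what it's worth, per-CPU reporting could look roughly like the sketch
below. It assumes the delivery result comes back as a bitmask of vcpus that
accepted the interrupt and that the boot cpu (index 0) is the one doing time
accounting. Both are assumptions for illustration only, and every name here
is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the boot cpu is the one that does time accounting. */
#define ACCOUNTING_CPU 0

/* Hypothetical device state, not an existing QEMU structure. */
typedef struct RTCPerCpuSketch {
    unsigned coalesced;  /* ticks the accounting cpu did not receive */
} RTCPerCpuSketch;

/*
 * Hypothetical delivery callback. Bit n of delivered_mask is set if vcpu n
 * accepted the interrupt. The tick counts as coalesced unless the
 * accounting cpu is among the receivers.
 */
static bool rtc_delivery_result(RTCPerCpuSketch *s, uint64_t delivered_mask)
{
    bool reached = (delivered_mask & (UINT64_C(1) << ACCOUNTING_CPU)) != 0;

    if (!reached) {
        s->coalesced++;
    }
    return reached;
}

/* Broadcast tick seen by cpu0+cpu1, then by cpu1 only, then by nobody. */
static unsigned rtc_percpu_demo(void)
{
    RTCPerCpuSketch s = { 0 };

    rtc_delivery_result(&s, 0x3);  /* cpu0 got it: not coalesced */
    rtc_delivery_result(&s, 0x2);  /* only cpu1 got it: coalesced */
    rtc_delivery_result(&s, 0x0);  /* nobody got it: coalesced */
    return s.coalesced;
}
```

This still leaves open how to know which cpu does the accounting without a
guest-specific switch.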

--
                        Gleb.
