On 22.01.2011, at 15:14, Edgar E. Iglesias wrote: > On Sat, Jan 22, 2011 at 02:13:03PM +0100, Aurelien Jarno wrote: >> On Tue, Jan 18, 2011 at 01:58:25PM +0100, Jan Kiszka wrote: >>> On 2011-01-18 13:05, Alexander Graf wrote: >>>> >>>> On 18.01.2011, at 10:08, Gerd Hoffmann wrote: >>>> >>>>> Hi, >>>>> >>>>>>> Worse might also be that unknown issue that force you to inject an IRQ >>>>>>> here. We don't know. That's probably worst. >>>>>> >>>>>> Well, IIRC the issue was that usually a level high interrupt line would >>>>>> simply retrigger an interrupt after enabling the interrupt line in the >>>>>> APIC again. >>>>> >>>>> edge triggered interrupts wouldn't though. >>>> >>>> The code change doesn't change anything for edge triggered interrupts - >>>> they work fine. Only !msi (== level) ones are broken. >>>> >>>>> >>>>>> Unless my memory completely fails on me, that didn't happen, >>>>>> so I added the manual retrigger on a partial command ACK in ahci code. >>>>> >>>>> That re-trigger smells like you are not clearing the interrupt line where >>>>> you should. For starters try calling ahci_check_irq() after guest writes >>>>> to PORT_IRQ_STAT. >>>> >>>> The problem happened when I had the following: >>>> >>>> ahci irq bits = 0 >>>> <events happen> >>>> ahci irq bits = 1 | 2 >>>> irq line trigger >>>> guest clears 1 >>>> ahci bits = 2 >>>> <no irq line trigger, since we're still irq high> >>>> >>>> On normal hardware, the high state of the irq line would simply trigger >>>> another interrupt in the guest when it gets ACKed on the LAPIC. Somehow it >>>> doesn't get triggered here. I don't see why clearing the interrupt line >>>> would help? It's not what real hardware would do, no? >>>> >>> >>> No, real devices would simply leave the line asserted as long as there >>> is a reason to. >>> >>> So again my question: With which irqchips do you see this effect, kvm's >>> in-kernel model and/or qemu's user space model? If there is an issue >>> with retriggering a CPU interrupt while the source is still asserted, >>> that probably needs to be fixed. >>> >> >> I guess it might be related to a problem identified long time ago on >> some targets, and that leads to the following #ifdef in i8259.c: >> >> | /* all targets should do this rather than acking the IRQ in the cpu */ >> | #if defined(TARGET_MIPS) || defined(TARGET_PPC) || defined(TARGET_ALPHA) >> >> For your information it has been fixed on MIPS in commit >> 4de9b249d37c1b382cc3e5a21fad1b4a11cec2fa. >> >> Basically when an interrupt line is high, the interrupt is getting >> delivered to the CPU. This part works correctly on x86. The CPU will >> take a corresponding action, basically either disabling the interrupt >> at the CPU or controller level or doing something on the device so that >> it lower its IRQ line. This new IRQ line level should then propagate >> through the various controllers, which should also lower their IRQ line >> if no other interrupt line are active. This ACK process should then >> continue up to the CPU. > > I totally agree. > > >> For x86 the interrupt state is cleared as soon as the interrupt is >> signaled to the CPU (see cpu-exec.c line 407), therefore if an interrupt >> is still pending, it won't be seen by the CPU. It's probably what you >> observed with AHCI. > > Yes, essentially, the CPU_INTERRUPT_HARD signal is an input to the CPU. > The CPU cannot drive it directly. To lower it, it must take some kind > of indirect action (IO or whatever) to clear the condition that is > forcing it high. Any assignments to clear or set the CPU_INTERRUPT_HARD > signal from within the CPU core are likely wrong.. > > FWIW, PPC code in cpu-exec.c:443 looks suspicious aswell...
How's that suspicious? As long as pending_interrupts is only reset on actual interrupt delivery, everything's fine. Interrupts like the decrementor are not level based on some PPCs, so the actual semantics are implementation dependent. Either way, I agree that getting interrupts right is very hard :). Alex