On 01/03/2013 03:07:49 PM, Alexander Graf wrote:
On 03.01.2013, at 20:54, Scott Wood wrote:
> On 01/03/2013 12:55:26 PM, Alexander Graf wrote:
>> On 22.12.2012, at 03:15, Scott Wood wrote:
>> > The two checks with abort() guard against potential QEMU-internal
>> > problems, but the EOI check stops the guest from causing updates
>> > to queue position -1 and other havoc if it writes EOI with no
>> > interrupt in service.
>> >
>> > Signed-off-by: Scott Wood <scottw...@freescale.com>
>> Did you ever actually experience this?
>
> Which one? EOI with no interrupt in service can be triggered by bad
> guest behavior, and I did see it happen when the guest was confused
> by another bug in QEMU's openpic (which is fixed elsewhere),
> resulting in an IRQ number of -1 being thrown around.
That's the last hunk, which as I said is fine :).
I would have found the issue in that hunk faster if I had array bounds
checking elsewhere, which was what led me to add it in certain places.
I'm not sure why I didn't add it in the place that would have helped
find the EOI bug, though (IRQ_resetbit). :-P
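
A minimal C sketch of the two kinds of check discussed above. It is not the actual openpic.c code: irq_queue, queue_setbit(), queue_resetbit(), queue_testbit(), handle_eoi() and the MAX_IRQ value of 64 are hypothetical names and values chosen for illustration. The bitmap helpers assert on an out-of-range IRQ (an internal bug), while the EOI path, which the guest can trigger, quietly ignores an EOI when nothing is in service.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_IRQ 64  /* hypothetical value, not taken from QEMU */

typedef struct {
    uint32_t bits[(MAX_IRQ + 31) / 32]; /* one bit per IRQ line */
    int next;                           /* IRQ currently in service, or -1 */
} irq_queue;

/* Internal invariant: callers must never pass an out-of-range IRQ. */
static void queue_setbit(irq_queue *q, int n_irq)
{
    assert(n_irq >= 0 && n_irq < MAX_IRQ);
    q->bits[n_irq / 32] |= UINT32_C(1) << (n_irq % 32);
}

static void queue_resetbit(irq_queue *q, int n_irq)
{
    assert(n_irq >= 0 && n_irq < MAX_IRQ);
    q->bits[n_irq / 32] &= ~(UINT32_C(1) << (n_irq % 32));
}

static bool queue_testbit(const irq_queue *q, int n_irq)
{
    assert(n_irq >= 0 && n_irq < MAX_IRQ);
    return (q->bits[n_irq / 32] >> (n_irq % 32)) & 1;
}

/*
 * Guest-triggerable path. Returns false and changes nothing when the
 * guest writes EOI with no interrupt in service, so -1 never reaches
 * the bitmap helpers.
 */
static bool handle_eoi(irq_queue *servicing)
{
    if (servicing->next < 0) {
        return false; /* spurious EOI: ignore it */
    }
    queue_resetbit(servicing, servicing->next);
    servicing->next = -1; /* simplified: a real PIC would rescan the queue */
    return true;
}
```

With the assert in queue_resetbit(), a stray -1 would stop at the first helper that sees it instead of silently corrupting memory next to the bitmap.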
> The other checks were to try to be more robust against bad IRQ
> numbers in general.
>
>> MAX_IRQ should match the memory region size, so we shouldn't be
>> able to receive any interrupt above it.
>
> Right, that's why I didn't add checking to the MMIO code. In
> IRQ_check it could happen due to bad bitmap contents (e.g. after a
> checkpoint restore), and in openpic_set_irq() it could happen if some
> device raises an IRQ that is out of bounds.
How would a device raise an IRQ that is out of bounds? Devices can
only raise IRQs that are passed down from the init function and that
only creates MAX_INT irq lines.
OK, so it looks like there would need to be a bug in the qdev gpio
mechanism rather than the devices -- but the interface boundary of
openpic.c does take an int rather than a pointer.
>> I might be inclined to accept an assert() there for internal
>> sanity checking though. The last hunk looks fine.
>
> Assert instead of abort is fine (there seem to be plenty of uses of
> both in QEMU), though for the openpic_set_irq() case it would be nice
> to be able to print the bad IRQ number before dying.
Well, that's why I was asking where you've seen this happen. It
really shouldn't. Ever. :)
That's why it's assert/abort and not some less severe form of error
handling. :-)
-Scott
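
To illustrate the assert-versus-abort point above, here is a hedged C sketch and not code from the patch. irq_in_range(), check_irq_assert(), check_irq_or_die(), set_irq_level() and MAX_IRQ = 64 are all hypothetical. A plain assert() reports only the failed expression and its location, whereas an explicit check can print the offending IRQ number before calling abort().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_IRQ 64  /* hypothetical value, not taken from QEMU */

static bool irq_in_range(int n_irq)
{
    return n_irq >= 0 && n_irq < MAX_IRQ;
}

/* Variant 1: assert() names the failed expression, not the bad value. */
static void check_irq_assert(int n_irq)
{
    assert(irq_in_range(n_irq));
}

/* Variant 2: report the bad IRQ number, then die just as hard. */
static void check_irq_or_die(const char *caller, int n_irq)
{
    if (!irq_in_range(n_irq)) {
        fprintf(stderr, "%s: IRQ %d out of range [0, %d)\n",
                caller, n_irq, MAX_IRQ);
        abort();
    }
}

/* Stand-in for an entry point that takes a plain int IRQ number. */
static int set_irq_level(int *levels, int n_irq, int level)
{
    check_irq_or_die(__func__, n_irq);
    levels[n_irq] = (level != 0);
    return levels[n_irq];
}
```

Both variants treat a bad IRQ number as a fatal internal error. They differ only in how much the crash message reveals, and in that assert() compiles away when NDEBUG is defined.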