On 2025/08/07 0:29, Laurent Vivier wrote:
A race condition between guest driver actions and QEMU timers can lead
to an assertion failure when the guest switches the e1000e from legacy
interrupt mode to MSI-X. If a legacy interrupt delay timer (TIDV or
RDTR) is active, but the guest enables MSI-X before the timer fires,
the pending interrupt cause can trigger an assert in
`e1000e_intmgr_collect_delayed_causes()`.

The function's assertion (`assert(core->delayed_causes == 0)`)
incorrectly assumes that it's impossible for a legacy delayed interrupt
to be pending once the device is in MSI-X mode.

This behavior is incorrect. On a physical device, a driver-initiated
mode switch would mask interrupts, reconfigure the hardware, and clear
any stale interrupt states. The legacy delay timers (TIDV/RDTR) are not
used for moderation in MSI-X mode; the Interrupt Throttle Rate (ITR)
mechanism is used instead. Therefore, any pending interrupt from the
old mode should be ignored.

It is true that triggering assertion is incorrect as per: docs/devel/secure-coding-practices.rst

However, I don't see statements in the datasheet that says mode switch will clear stale interrupts.

The expression "TIDV/RDTR are not used for moderation in MSI-X mode" is also unclear. Behaving drivers may indeed use ITR for that purpose, but the question for us is: what will e1000e do when the guest tries to use TIDV/RDTR in MSI-X mode anyway? That defines the behavior we need to implement.

If the datasheet describes the expected behavior with delayed interrupts in MSI-X, a reference to the datasheet should be made at least in the patch message. Otherwise, perhaps this "if (msix_enabled(core->owner))" is just extraneous and should be removed.

Regards,
Akihiko Odaki


Replace the overly strict assertion with a statement that clears any
stale `delayed_causes`. This correctly models the hardware's behavior
of discarding obsolete interrupt events during a mode change and
prevents the QEMU process from terminating.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1863
Signed-off-by: Laurent Vivier <lviv...@redhat.com>
---
  hw/net/e1000e_core.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 24138587905b..d0ec892488d4 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -342,7 +342,7 @@ e1000e_intmgr_collect_delayed_causes(E1000ECore *core)
      uint32_t res;
if (msix_enabled(core->owner)) {
-        assert(core->delayed_causes == 0);
+        core->delayed_causes = 0;
          return 0;
      }


Reply via email to