On 9/20/24 20:59, Wander Lairson Costa wrote:
During testing of SR-IOV, Red Hat QE encountered an issue where the
ip link up command intermittently fails for the igbvf interfaces when
using the PREEMPT_RT variant. Investigation revealed that
e1000_write_posted_mbx returns an error due to the lack of an ACK
from e1000_poll_for_ack.

The underlying issue arises from the fact that IRQs are threaded by
default under PREEMPT_RT. While the exact hardware details are not
available, it appears that the IRQ handled by igb_msix_other must
be processed before e1000_poll_for_ack times out. However,
e1000_write_posted_mbx is called with preemption disabled, leading
to a scenario where the IRQ is serviced only after the failure of
e1000_write_posted_mbx.

To resolve this, we set IRQF_NO_THREAD for the affected interrupt,
ensuring that the kernel handles it immediately, thereby preventing
the aforementioned error.

Reproducer:

     #!/bin/bash

     # echo 2 > /sys/class/net/ens14f0/device/sriov_numvfs
     ipaddr_vlan=3
     nic_test=ens14f0
     vf=${nic_test}v0

     while true; do
            ip link set ${nic_test} mtu 1500
            ip link set ${vf} mtu 1500
            ip link set $vf up
            ip link set ${nic_test} vf 0 vlan ${ipaddr_vlan}
            ip addr add 172.30.${ipaddr_vlan}.1/24 dev ${vf}
            ip addr add 2021:db8:${ipaddr_vlan}::1/64 dev ${vf}
            if ! ip link show $vf | grep 'state UP'; then
                    echo 'Error found'
                    break
            fi
            ip link set $vf down
     done

Signed-off-by: Wander Lairson Costa <wan...@redhat.com>
Reported-by: Yuying Ma <y...@redhat.com>
---
  drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 1ef4cb871452..8a1696d7289f 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -907,7 +907,7 @@ static int igb_request_msix(struct igb_adapter *adapter)
        int i, err = 0, vector = 0, free_vector = 0;
        err = request_irq(adapter->msix_entries[vector].vector,
-                         igb_msix_other, 0, netdev->name, adapter);
+                         igb_msix_other, IRQF_NO_THREAD, netdev->name, adapter);
        if (err)
                goto err_out;
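
The timing problem described in the commit message can be illustrated with a toy userspace model. This is only a sketch under stated assumptions, not driver code: the struct and the model_* functions are hypothetical names, the poll count of 100 is arbitrary, and the model assumes (as the commit message suggests, without hardware details) that the ACK becomes visible only after the interrupt handler has run.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not driver code. Every name below is hypothetical. */
struct mbx_model {
	bool irq_pending; /* device has raised its mailbox interrupt */
	bool ack;         /* becomes true only after the handler has run */
};

/* Stand-in for the interrupt handler that makes the ACK visible. */
static void model_handle_irq(struct mbx_model *m)
{
	if (m->irq_pending) {
		m->irq_pending = false;
		m->ack = true;
	}
}

/*
 * Poll for the ACK at most max_polls times, as if preemption were disabled.
 * A non-threaded handler runs in hard-IRQ context and can interrupt the
 * loop; a threaded one needs the scheduler, so it is never run in here.
 * Returns 0 on ACK and -1 on timeout.
 */
static int model_poll_for_ack(struct mbx_model *m, int max_polls, bool threaded)
{
	assert(max_polls > 0);
	for (int i = 0; i < max_polls; i++) {
		if (!threaded)
			model_handle_irq(m);
		if (m->ack)
			return 0;
	}
	return -1;
}

/* Send one message: the device raises its IRQ, then the sender polls. */
static int model_write_posted(bool threaded)
{
	struct mbx_model m = { .irq_pending = true, .ack = false };
	int ret = model_poll_for_ack(&m, 100, threaded);

	/* Preemption is "re-enabled" here, so a threaded handler runs only now. */
	model_handle_irq(&m);
	return ret;
}
```

In this model, model_write_posted(true) returns -1 because the handler runs only after the poll loop has given up, while model_write_posted(false) returns 0. That mirrors the effect the patch aims for by passing IRQF_NO_THREAD to request_irq.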

Thank you for the small, localized fix with a good description.
Our VAL will also check it on a non-RT OS.
Reviewed-by: Przemek Kitszel <przemyslaw.kits...@intel.com>

PS: for future Intel Ethernet submissions, please split fixes and
refactors into separate commits, and tag each commit with [iwl-net]
or [iwl-next].
