Hi Ruifeng,

Thanks for your review.

On Thur, May 4, 2023 at 2:13PM, Ruifeng Wang wrote:
-----Original Message-----
From: Konstantin Ananyev <konstantin.v.anan...@yandex.ru>
Sent: Monday, May 1, 2023 9:29 PM
To: zhou...@loongson.cn
Cc: dev@dpdk.org; maob...@loongson.cn; qiming.y...@intel.com; 
wenjun1...@intel.com;
Ruifeng Wang <ruifeng.w...@arm.com>; d...@linux.vnet.ibm.com
Subject: Re: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx 
functions

Segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson
3C5000 processor which has 64 cores and 4 NUMA nodes.

 From the ixgbe_recv_pkts_lro() function, we found that as long as the
first packet has the EOP bit set, and the length of this packet is
less than or equal to rxq->crc_len, the segmentation fault will
definitely happen even though on the other platforms, such as X86.

Because when processd the first packet the first_seg->next will be
NULL, if at the same time this packet has the EOP bit set and its
length is less than or equal to rxq->crc_len, the following loop will be 
excecuted:

     for (lp = first_seg; lp->next != rxm; lp = lp->next)
         ;

We know that the first_seg->next will be NULL under this condition. So
the expression of lp->next->next will cause the segmentation fault.

Normally, the length of the first packet with EOP bit set will be
greater than rxq->crc_len. However, the out-of-order execution of CPU
may make the read ordering of the status and the rest of the
descriptor fields in this function not be correct. The related codes are as 
following:

         rxdp = &rx_ring[rx_id];
  #1     staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);

         if (!(staterr & IXGBE_RXDADV_STAT_DD))
             break;

  #2     rxd = *rxdp;

The sentence #2 may be executed before sentence #1. This action is
likely to make the ready packet zero length. If the packet is the
first packet and has the EOP bit set, the above segmentation fault will happen.

So, we should add rte_rmb() to ensure the read ordering be correct. We
also did the same thing in the ixgbe_recv_pkts() function to make the
rxd data be valid even thougth we did not find segmentation fault in this 
function.

Signed-off-by: Min Zhou <zhou...@loongson.cn>
"Fixes" tag for backport.
OK, I will add the "Fixes" tag in the V3 patch.
---
v2:
- Make the calling of rte_rmb() for all platforms
---
  drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
                staterr = rxdp->wb.upper.status_error;
                if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
                        break;
+
+               rte_rmb();
                rxd = *rxdp;


Indeed, looks like a problem to me on systems with relaxed MO.
Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.
Thanks, Konstantin.

About a fix - looks right, but a bit excessive to me - as I understand all we 
need here is
to prevent re-ordering by CPU itself.
So rte_smp_rmb() seems enough here.
Agree that rte_rmb() is excessive.
rte_smp_rmb() or rte_atomic_thread_fence(__ATOMIC_ACQUIRE) is enough.
Thanks for your advice. I will compare the rte_smp_rmb(), __atomic_load_n() and rte_atomic_thread_fence() to choose a better one.
And it is better to add a comment to justify the barrier.
OK, I will add a comment for this change.
Or might be just:
staterr = __atomic_load_n(&rxdp->wb.upper.status_error, __ATOMIC_ACQUIRE);


                /*
@@ -2122,6 +2124,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf 
**rx_pkts,
uint16_t nb_pkts,
With the proper barrier in place, I think the long comments at the beginning of 
this loop can be removed.
Yes, I think the long comments can be simplified when the proper barrier is already in place.
                if (!(staterr & IXGBE_RXDADV_STAT_DD))
                        break;

+               rte_rmb();
                rxd = *rxdp;

                PMD_RX_LOG(DEBUG, "port_id=%u queue_id=%u rx_id=%u "
--
2.31.1

Best regards,

Min

Reply via email to