<snip>

> 
> Rx descriptor is 16B/32B in size and consists of multiple words.
> The word that includes DD field should be read first. Read result with DD bit
> set indicates the rest part in a descriptor is valid.
Suggest rewording as follows:
The Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates that
the rest of the descriptor words have valid values. Hence, the word containing
the DD bit must be read before the rest of the descriptor words.

> 
> In functions for simple Rx, the descriptor is not read atomically in whole. On
> weaker ordered systems like aarch64, read of the word that includes DD field
> could be reordered after read of other words.
> In this case, some words could be invalid data.
Suggest rewording as follows:
Since the entire descriptor is not read atomically, on systems with relaxed
memory ordering such as aarch64, the read of the word containing the DD field
could be reordered after the reads of the other words.

> 
> Read barrier is inserted between read of the word with DD field and read of
> other words. The barrier ensures what fetched is correct descriptor data.
Suggest capturing the performance impact, so it is clearly documented.
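
For reference, the ordering requirement can be sketched in plain C11 as below.
This is only an illustration under stated assumptions, not the driver code:
the struct layout, the DEMO_DD_BIT position and the demo_* names are all
hypothetical, and the release store merely stands in for the NIC's write-back.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16B descriptor: qword1 carries the DD bit, qword0 the data. */
struct demo_rx_desc {
	uint64_t qword0;         /* payload word, valid only once DD is set */
	_Atomic uint64_t qword1; /* status word; bit 0 is DD in this sketch */
};

#define DEMO_DD_BIT UINT64_C(1)

/* Stand-in for the device write-back: fill the payload, then publish DD. */
static void demo_writeback(struct demo_rx_desc *d, uint64_t payload)
{
	d->qword0 = payload;
	atomic_store_explicit(&d->qword1, DEMO_DD_BIT, memory_order_release);
}

/* Driver-side poll: read the DD word first, fence, then read the rest. */
static bool demo_poll(struct demo_rx_desc *d, uint64_t *out)
{
	uint64_t status = atomic_load_explicit(&d->qword1,
					       memory_order_relaxed);

	if (!(status & DEMO_DD_BIT))
		return false;

	/*
	 * Acquire fence: loads issued after this point cannot be
	 * reordered before the load of qword1 above.
	 */
	atomic_thread_fence(memory_order_acquire);

	*out = d->qword0;
	return true;
}
```

Without the fence, a weakly ordered CPU is allowed to satisfy the qword0 load
before the qword1 load, which is the stale-data case described above.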

> 
> Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ruifeng Wang <ruifeng.w...@arm.com>
With the above comments,
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>

> ---
> The change should not impact performance on x86 as acquire fence is ignored
> on x86.
> 
>  drivers/net/i40e/i40e_rxtx.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
> index 8329cbdd4e..c4cd6b6b60 100644
> --- a/drivers/net/i40e/i40e_rxtx.c
> +++ b/drivers/net/i40e/i40e_rxtx.c
> @@ -746,6 +746,12 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
>                       break;
>               }
> 
> +             /**
> +              * Use acquire fence to ensure that qword1 which includes DD
> +              * bit is loaded before loading of other descriptor words.
> +              */
> +             rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> +
>               rxd = *rxdp;
>               nb_hold++;
>               rxe = &sw_ring[rx_id];
> @@ -862,6 +868,12 @@ i40e_recv_scattered_pkts(void *rx_queue,
>                       break;
>               }
> 
> +             /**
> +              * Use acquire fence to ensure that qword1 which includes DD
> +              * bit is loaded before loading of other descriptor words.
> +              */
> +             rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> +
>               rxd = *rxdp;
>               nb_hold++;
>               rxe = &sw_ring[rx_id];
> --
> 2.25.1
