> -----Original Message-----
> From: Rong, Leyi
> Sent: Thursday, August 29, 2019 4:05 PM
> To: Zhang, Qi Z <qi.z.zh...@intel.com>; Ye, Xiaolong
> <xiaolong...@intel.com>; Wang, Haiyue <haiyue.w...@intel.com>; Lu,
> Wenzhuo <wenzhuo...@intel.com>
> Cc: dev@dpdk.org; Rong, Leyi <leyi.r...@intel.com>
> Subject: [PATCH v2 6/6] net/ice: switch to Rx flexible descriptor in AVX path
> 
> Switch to Rx flexible descriptor format instead of legacy descriptor format.
> 
> Signed-off-by: Leyi Rong <leyi.r...@intel.com>
> ---
>  drivers/net/ice/ice_rxtx_vec_avx2.c | 232 ++++++++++++++--------------
>  1 file changed, 118 insertions(+), 114 deletions(-)
> 
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c
> b/drivers/net/ice/ice_rxtx_vec_avx2.c
> index 5ce29c2a2..158f17d80 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx2.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c
> @@ -15,10 +15,10 @@ ice_rxq_rearm(struct ice_rx_queue *rxq)  {
>       int i;
>       uint16_t rx_id;
> -     volatile union ice_rx_desc *rxdp;
> +     volatile union ice_rx_flex_desc *rxdp;
>       struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start];
> 
> -     rxdp = rxq->rx_ring + rxq->rxrearm_start;
> +     rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rxrearm_start;

Since after this patch all data paths (normal, SSE, AVX2) have moved to the flex 
descriptor, ice_rx_desc is not used anymore. We could replace all remaining uses 
with ice_rx_flex_desc, and then the cast above can be avoided.

<.......>

>                * take the two sets of status bits and merge to one @@ -450,20
> +452,22 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct
> rte_mbuf **rx_pkts,
>               /* get only flag/error bits we want */
>               const __m256i flag_bits =
>                       _mm256_and_si256(status0_7, flags_mask);
> -             /* set vlan and rss flags */
> -             const __m256i vlan_flags =
> -                     _mm256_shuffle_epi8(vlan_flags_shuf, flag_bits);
> -             const __m256i rss_flags =
> -                     _mm256_shuffle_epi8(rss_flags_shuf,
> -                                         _mm256_srli_epi32(flag_bits, 11));
>               /**
>                * l3_l4_error flags, shuffle, then shift to correct adjustment
>                * of flags in flags_shuf, and finally mask out extra bits
>                */
>               __m256i l3_l4_flags = _mm256_shuffle_epi8(l3_l4_flags_shuf,
> -                             _mm256_srli_epi32(flag_bits, 22));
> +                             _mm256_srli_epi32(flag_bits, 4));
>               l3_l4_flags = _mm256_slli_epi32(l3_l4_flags, 1);
>               l3_l4_flags = _mm256_and_si256(l3_l4_flags, cksum_mask);
> +             /* set rss and vlan flags */
> +             const __m256i rss_vlan_flag_bits =
> +                     _mm256_srli_epi32(flag_bits, 12);
> +             const __m256i rss_flags =
> +                     _mm256_shuffle_epi8(rss_flags_shuf, rss_vlan_flag_bits);
> +             const __m256i vlan_flags =
> +                     _mm256_shuffle_epi8(vlan_flags_shuf,
> +                                         rss_vlan_flag_bits);

It seems we can OR rss_flags_shuf and vlan_flags_shuf together, so only one 
shuffle is needed here, saving some CPU cycles?

> 
>               /* merge flags */
>               const __m256i mbuf_flags = _mm256_or_si256(l3_l4_flags,
> --
> 2.17.1
