> -----Original Message-----
> From: Rong, Leyi
> Sent: Thursday, August 29, 2019 4:05 PM
> To: Zhang, Qi Z <qi.z.zh...@intel.com>; Ye, Xiaolong
> <xiaolong...@intel.com>; Wang, Haiyue <haiyue.w...@intel.com>; Lu,
> Wenzhuo <wenzhuo...@intel.com>
> Cc: dev@dpdk.org; Rong, Leyi <leyi.r...@intel.com>
> Subject: [PATCH v2 6/6] net/ice: switch to Rx flexible descriptor in AVX path
>
> Switch to Rx flexible descriptor format instead of legacy descriptor format.
>
> Signed-off-by: Leyi Rong <leyi.r...@intel.com>
> ---
> drivers/net/ice/ice_rxtx_vec_avx2.c | 232 ++++++++++++++--------------
> 1 file changed, 118 insertions(+), 114 deletions(-)
>
> diff --git a/drivers/net/ice/ice_rxtx_vec_avx2.c
> b/drivers/net/ice/ice_rxtx_vec_avx2.c
> index 5ce29c2a2..158f17d80 100644
> --- a/drivers/net/ice/ice_rxtx_vec_avx2.c
> +++ b/drivers/net/ice/ice_rxtx_vec_avx2.c
> @@ -15,10 +15,10 @@ ice_rxq_rearm(struct ice_rx_queue *rxq) {
> int i;
> uint16_t rx_id;
> - volatile union ice_rx_desc *rxdp;
> + volatile union ice_rx_flex_desc *rxdp;
> struct ice_rx_entry *rxep = &rxq->sw_ring[rxq->rxrearm_start];
>
> - rxdp = rxq->rx_ring + rxq->rxrearm_start;
> + rxdp = (union ice_rx_flex_desc *)rxq->rx_ring + rxq->rxrearm_start;
After this patch, all data paths (normal, SSE, AVX2) have moved to the flex
descriptor, so ice_rx_desc is not used anymore. We could replace every use of
it with ice_rx_flex_desc, and then the cast above can be avoided.
<.......>
> * take the two sets of status bits and merge to one
> @@ -450,20 +452,22 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> /* get only flag/error bits we want */
> const __m256i flag_bits =
> _mm256_and_si256(status0_7, flags_mask);
> - /* set vlan and rss flags */
> - const __m256i vlan_flags =
> - _mm256_shuffle_epi8(vlan_flags_shuf, flag_bits);
> - const __m256i rss_flags =
> - _mm256_shuffle_epi8(rss_flags_shuf,
> - _mm256_srli_epi32(flag_bits, 11));
> /**
> * l3_l4_error flags, shuffle, then shift to correct adjustment
> * of flags in flags_shuf, and finally mask out extra bits
> */
> __m256i l3_l4_flags = _mm256_shuffle_epi8(l3_l4_flags_shuf,
> - _mm256_srli_epi32(flag_bits, 22));
> + _mm256_srli_epi32(flag_bits, 4));
> l3_l4_flags = _mm256_slli_epi32(l3_l4_flags, 1);
> l3_l4_flags = _mm256_and_si256(l3_l4_flags, cksum_mask);
> + /* set rss and vlan flags */
> + const __m256i rss_vlan_flag_bits =
> + _mm256_srli_epi32(flag_bits, 12);
> + const __m256i rss_flags =
> + _mm256_shuffle_epi8(rss_flags_shuf, rss_vlan_flag_bits);
> + const __m256i vlan_flags =
> + _mm256_shuffle_epi8(vlan_flags_shuf,
> + rss_vlan_flag_bits);
Both shuffles use the same index (rss_vlan_flag_bits), so it seems we could OR
rss_flags_shuf and vlan_flags_shuf into a single table and do just one shuffle
here, which would save some CPU cycles?
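
To illustrate the idea, here is a rough scalar sketch, not driver code. It
models the per-byte table lookup that _mm256_shuffle_epi8 performs within one
128-bit lane (low 4 bits of the index select the entry, bit 7 set gives zero)
and checks that one lookup in an OR-ed table matches two lookups OR-ed
together. The names lut_lookup, lut_merge, merged_lookup_matches, the DEMO_*
flag values and the index bit layout are all made up for the example; the real
shuffle constants and mbuf flag bits would need to be checked against the
driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LUT_SIZE 16

/* Hypothetical flag values for illustration only, not the real mbuf flags. */
#define DEMO_FLAG_RSS  0x02
#define DEMO_FLAG_VLAN 0x41

/*
 * Scalar stand-in for a byte shuffle within one 128-bit lane:
 * the low 4 bits of idx select a table entry, and the result is
 * zero when bit 7 of idx is set.
 */
static uint8_t lut_lookup(const uint8_t table[LUT_SIZE], uint8_t idx)
{
	return (idx & 0x80) ? 0 : table[idx & 0x0F];
}

/*
 * OR two lookup tables entry by entry. In a driver this would be done
 * once, when the shuffle constant is defined, not per burst.
 */
static void lut_merge(const uint8_t a[LUT_SIZE], const uint8_t b[LUT_SIZE],
		      uint8_t out[LUT_SIZE])
{
	assert(a != NULL && b != NULL && out != NULL);
	for (size_t i = 0; i < LUT_SIZE; i++)
		out[i] = a[i] | b[i];
}

/*
 * Returns 1 if, for every possible index byte, a single lookup in the
 * merged table equals the two separate lookups OR-ed together.
 */
static int merged_lookup_matches(void)
{
	uint8_t rss[LUT_SIZE], vlan[LUT_SIZE], both[LUT_SIZE];

	/* Assumed index layout: bit 0 = RSS valid, bit 1 = VLAN present. */
	for (size_t i = 0; i < LUT_SIZE; i++) {
		rss[i] = (i & 1) ? DEMO_FLAG_RSS : 0;
		vlan[i] = (i & 2) ? DEMO_FLAG_VLAN : 0;
	}
	lut_merge(rss, vlan, both);

	for (unsigned int idx = 0; idx < 256; idx++) {
		uint8_t two = lut_lookup(rss, (uint8_t)idx) |
			      lut_lookup(vlan, (uint8_t)idx);
		uint8_t one = lut_lookup(both, (uint8_t)idx);

		if (one != two)
			return 0;
	}
	return 1;
}
```

This only works when both lookups take the same index vector and their results
are OR-ed together afterwards, which looks to be the case here since both feed
the "merge flags" step.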
>
> /* merge flags */
> const __m256i mbuf_flags = _mm256_or_si256(l3_l4_flags,
> --
> 2.17.1