> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Friday, August 30, 2019 7:32 AM
> To: Rong, Leyi <leyi.r...@intel.com>; Ye, Xiaolong <xiaolong...@intel.com>;
> Wang, Haiyue <haiyue.w...@intel.com>; Lu, Wenzhuo <wenzhuo...@intel.com>
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v2 6/6] net/ice: switch to Rx flexible descriptor in AVX 
> path
> 
> 
> >              * take the two sets of status bits and merge to one
> > @@ -450,20 +452,22 @@ _ice_recv_raw_pkts_vec_avx2(struct ice_rx_queue *rxq, struct rte_mbuf **rx_pkts,
> >             /* get only flag/error bits we want */
> >             const __m256i flag_bits =
> >                     _mm256_and_si256(status0_7, flags_mask);
> > -           /* set vlan and rss flags */
> > -           const __m256i vlan_flags =
> > -                   _mm256_shuffle_epi8(vlan_flags_shuf, flag_bits);
> > -           const __m256i rss_flags =
> > -                   _mm256_shuffle_epi8(rss_flags_shuf,
> > -                                       _mm256_srli_epi32(flag_bits, 11));
> >             /**
> >              * l3_l4_error flags, shuffle, then shift to correct adjustment
> >              * of flags in flags_shuf, and finally mask out extra bits
> >              */
> >             __m256i l3_l4_flags = _mm256_shuffle_epi8(l3_l4_flags_shuf,
> > -                           _mm256_srli_epi32(flag_bits, 22));
> > +                           _mm256_srli_epi32(flag_bits, 4));
> >             l3_l4_flags = _mm256_slli_epi32(l3_l4_flags, 1);
> >             l3_l4_flags = _mm256_and_si256(l3_l4_flags, cksum_mask);
> > +           /* set rss and vlan flags */
> > +           const __m256i rss_vlan_flag_bits =
> > +                   _mm256_srli_epi32(flag_bits, 12);
> > +           const __m256i rss_flags =
> > +                   _mm256_shuffle_epi8(rss_flags_shuf, rss_vlan_flag_bits);
> > +           const __m256i vlan_flags =
> > +                   _mm256_shuffle_epi8(vlan_flags_shuf,
> > +                                       rss_vlan_flag_bits);
> 
> It seems we can OR rss_flags_shuf and vlan_flags_shuf into a single table, so
> only one shuffle is needed here, which would save some CPU cycles?
> 

That makes sense. I will run some benchmarks for this adjustment :).
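
As a sanity check on the idea, here is a minimal scalar sketch of why one
shuffle should be enough when both lookups use the same index vector. It
models one 16-byte lane of the byte shuffle in plain C instead of using the
AVX2 intrinsic. The flag values and table contents are hypothetical and are
not the real rss_flags_shuf / vlan_flags_shuf tables from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only (not the real mbuf values). */
#define FLAG_RSS  0x02u
#define FLAG_VLAN 0x01u

/*
 * Scalar model of one 16-byte lane of a byte shuffle:
 * out[i] is 0 if the top bit of idx[i] is set, else tbl[idx[i] & 0x0F].
 */
void shuffle16(const uint8_t tbl[16], const uint8_t idx[16], uint8_t out[16])
{
	for (int i = 0; i < 16; i++)
		out[i] = (idx[i] & 0x80) ? 0 : tbl[idx[i] & 0x0F];
}

/*
 * Returns 1 if (shuffle(rss_tbl) | shuffle(vlan_tbl)) equals
 * shuffle(rss_tbl | vlan_tbl) for every possible index byte, else 0.
 */
int merged_table_matches(void)
{
	/* Hypothetical tables: index bit 0 = RSS valid, bit 1 = VLAN present. */
	const uint8_t rss_tbl[16]  = {0, FLAG_RSS, 0, FLAG_RSS};
	const uint8_t vlan_tbl[16] = {0, 0, FLAG_VLAN, FLAG_VLAN};
	uint8_t merged_tbl[16];

	for (int i = 0; i < 16; i++)
		merged_tbl[i] = rss_tbl[i] | vlan_tbl[i];

	for (int v = 0; v < 256; v++) {
		uint8_t idx[16], rss[16], vlan[16], merged[16];

		/* Cover every index byte value, including top-bit-set ones. */
		for (int i = 0; i < 16; i++)
			idx[i] = (uint8_t)(v + i);

		shuffle16(rss_tbl, idx, rss);
		shuffle16(vlan_tbl, idx, vlan);
		shuffle16(merged_tbl, idx, merged);

		for (int i = 0; i < 16; i++)
			if ((uint8_t)(rss[i] | vlan[i]) != merged[i])
				return 0;
	}
	return 1;
}

/* Aborts if the merged table ever disagrees with the two-lookup result. */
void self_check(void)
{
	assert(merged_table_matches());
}
```

Since each output byte is a pure table lookup on the same index, the OR can
be applied to the tables once at setup time instead of to the results per
burst. Whether that is measurably faster in the AVX2 path still needs the
benchmark.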

> >
> >             /* merge flags */
> >             const __m256i mbuf_flags = _mm256_or_si256(l3_l4_flags,
> > --
> > 2.17.1
