05/07/2024 19:43, Yoan Picchi:
> On 7/4/24 21:31, David Marchand wrote:
> > On Wed, Jul 3, 2024 at 7:13 PM Yoan Picchi <yoan.pic...@arm.com> wrote:
> >> --- /dev/null
> >> +++ b/lib/hash/compare_signatures_arm_pvt.h
> > 
> > I guess pvt stands for private.
> > No need for such suffix, this header won't be exported in any case.
> 
> pvt do stand for private, yes. I had a look at the other lib and what 
> they used to state a header as private. Several (rcu, ring and stack) 
> use _pvt so it looks like that's might be the standard? If no, then how 
> am I supposed to differentiate a public and a private header?

Public headers are prefixed with rte_
We should not use _pvt


> >> +#if RTE_HASH_BUCKET_ENTRIES <= 8
> >> +       case RTE_HASH_COMPARE_NEON: {
> >> +               uint16x8_t vmat, vsig, x;
> >> +               int16x8_t shift = {0, 1, 2, 3, 4, 5, 6, 7};
> >> +               uint16_t low, high;
> >> +
> >> +               vsig = vld1q_dup_u16((uint16_t const *)&sig);
> >> +               /* Compare all signatures in the primary bucket */
> >> +               vmat = vceqq_u16(vsig, vld1q_u16((uint16_t const 
> >> *)prim_bucket_sigs));
> >> +               x = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x0001)), shift);
> >> +               low = (uint16_t)(vaddvq_u16(x));
> >> +               /* Compare all signatures in the secondary bucket */
> >> +               vmat = vceqq_u16(vsig, vld1q_u16((uint16_t const 
> >> *)sec_bucket_sigs));
> >> +               x = vshlq_u16(vandq_u16(vmat, vdupq_n_u16(0x0001)), shift);
> >> +               high = (uint16_t)(vaddvq_u16(x));
> >> +               *hitmask_buffer = low | high << RTE_HASH_BUCKET_ENTRIES;
> >> +
> >> +               }
> >> +               break;
> >> +#endif
> >> +       default:
> >> +               for (unsigned int i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) 
> >> {
> >> +                       *hitmask_buffer |= (sig == prim_bucket_sigs[i]) << 
> >> i;
> >> +                       *hitmask_buffer |=
> >> +                               ((sig == sec_bucket_sigs[i]) << i) << 
> >> RTE_HASH_BUCKET_ENTRIES;
> >> +               }
> >> +       }
> >> +}
> > 
> > IIRC, this code is copied in all three headers.
> > It is a common scalar version, so the ARM code could simply call the
> > "generic" implementation rather than copy/paste.
> 
> Out of the three files, only two versions are the same: generic and arm. 
> Intel's version do have some padding added (given it's sparse).
> I prefer to keep a scalar version in the arm implementation because 
> that's what match the legacy implementation. We used to be able to 
> choose (at runtime) to use the scalar path even if we had neon. In 
> practice the choice ends up being made from #defines, but as far as this 
> function goes, it is a runtime decision.

I have no strong opinion.



Reply via email to