On Sun, Jan 15, 2017 at 12:04:31PM +0000, Pablo de Lara wrote:
> Elastic Flow Distributor (EFD) is a distributor library that uses
> perfect hashing to determine a target/value for a given incoming flow key.
> It has the following advantages:
> 
> - First, because it uses perfect hashing, it does not store
>   the key itself and hence lookup performance is not dependent
>   on the key size.
> 
> - Second, the target/value can be any arbitrary value hence
>   the system designer and/or operator can better optimize service rates
>   and inter-cluster network traffic locating.
> 
> - Third, since the storage requirement is much smaller than a hash-based
>   flow table (i.e. better fit for CPU cache), EFD can scale to
>   millions of flow keys.
>   Finally, with current optimized library implementation performance
>   is fully scalable with number of CPU cores.
> 
> Signed-off-by: Byron Marohn <byron.mar...@intel.com>
> Signed-off-by: Pablo de Lara <pablo.de.lara.gua...@intel.com>
> Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuga...@intel.com>
> Acked-by: Christian Maciocco <christian.macio...@intel.com>
> ---
> +#if (RTE_EFD_VALUE_NUM_BITS == 8 || RTE_EFD_VALUE_NUM_BITS == 16 || \
> +     RTE_EFD_VALUE_NUM_BITS == 24 || RTE_EFD_VALUE_NUM_BITS == 32)
> +#define EFD_LOAD_SI128(val) _mm_load_si128(val)
> +#else
> +#define EFD_LOAD_SI128(val) _mm_lddqu_si128(val)
> +#endif
> +
> +static inline efd_value_t
> +efd_lookup_internal(const struct efd_online_group_entry * const group,
> +             const uint32_t hash_val_a, const uint32_t hash_val_b,
> +             enum rte_efd_compare_function cmp_fn)
> +{
> +     efd_value_t value = 0;
> +     uint32_t i;
> +
> +     switch (cmp_fn) {
> +#ifdef RTE_MACHINE_CPUFLAG_AVX2
> +     case RTE_HASH_COMPARE_AVX2:
> +
> +             i = 0;
> +             __m256i vhash_val_a = _mm256_set1_epi32(hash_val_a);
> +             __m256i vhash_val_b = _mm256_set1_epi32(hash_val_b);
> +

Could you please abstract and move SIMD specific code to another file like other
libraries(example: lib_acl) to enable smooth integration with neon and altivec
SIMD implementations in future.

> +             for (; i < RTE_EFD_VALUE_NUM_BITS; i += 8) {
> +                     __m256i vhash_idx =
> +                                     _mm256_cvtepu16_epi32(EFD_LOAD_SI128(
> +                                     (__m128i const *) &group->hash_idx[i]));
> +                     __m256i vlookup_table = _mm256_cvtepu16_epi32(
> +                                     EFD_LOAD_SI128((__m128i const *)
> +                                     &group->lookup_table[i]));
> +                     __m256i vhash = _mm256_add_epi32(vhash_val_a,
> +                                     _mm256_mullo_epi32(vhash_idx, 
> vhash_val_b));
> +                     __m256i vbucket_idx = _mm256_srli_epi32(vhash,
> +                                     EFD_LOOKUPTBL_SHIFT);
> +                     __m256i vresult = _mm256_srlv_epi32(vlookup_table,
> +                                     vbucket_idx);
> +
> +                     value |= (_mm256_movemask_ps(
> +                             (__m256) _mm256_slli_epi32(vresult, 31))
> +                             & ((1 << (RTE_EFD_VALUE_NUM_BITS - i)) - 1)) << 
> i;
> +             }
> +             break;
> +#endif

Reply via email to