On Mon, 18 Oct 2021 10:40:00 +0000 "Ananyev, Konstantin" <konstantin.anan...@intel.com> wrote:
> > On Fri, 15 Oct 2021 10:30:02 +0100 > > Vladimir Medvedkin <vladimir.medved...@intel.com> wrote: > > > > > + m[i * 8 + j] = (rss_key[i] << j)| > > > + (uint8_t)((uint16_t)(rss_key[i + 1]) >> > > > + (8 - j)); > > > + } > > > > This ends up being harder than necessary to read. Maybe split into > > multiple statements and/or use temporary variable. > > > > > +RTE_INIT(rte_thash_gfni_init) > > > +{ > > > + rte_thash_gfni_supported = 0; > > > > Not necessary in C globals are initialized to zero by default. > > > > By removing that the constructor can be totally behind #ifdef > > > > > +__rte_internal > > > +static inline __m512i > > > +__rte_thash_gfni(const uint64_t *mtrx, const uint8_t *tuple, > > > + const uint8_t *secondary_tuple, int len) > > > +{ > > > + __m512i permute_idx = _mm512_set_epi8(7, 6, 5, 4, 7, 6, 5, 4, > > > + 6, 5, 4, 3, 6, 5, 4, 3, > > > + 5, 4, 3, 2, 5, 4, 3, 2, > > > + 4, 3, 2, 1, 4, 3, 2, 1, > > > + 3, 2, 1, 0, 3, 2, 1, 0, > > > + 2, 1, 0, -1, 2, 1, 0, -1, > > > + 1, 0, -1, -2, 1, 0, -1, -2, > > > + 0, -1, -2, -3, 0, -1, -2, -3); > > > > NAK > > > > Please don't put the implementation in an inline. This makes it harder > > to support (API/ABI) and blocks other architectures from implementing > > same thing with different instructions. > > I don't really understand your reasoning here. > rte_thash_gfni.h is an arch-specific header, which provides > arch-specific optimizations for RSS hash calculation > (Vladimir pls correct me if I am wrong here). Ok, but rte_thash_gfni.h is included on all architectures. > We do have dozens of inline functions that do use arch-specific instructions > (both x86 and arm) > for different purposes: > sync primitives, memory-ordering, cache manipulations, LPM lookup, TSX, > power-saving, etc. > That's a usual trade-off taken for performance reasons, when extra function > call > costs too much comparing to the operation itself. > Why it suddenly became a problem for that particular case and how exactly it > blocks other architectures? > Also I don't understand how it makes things harder in terms of API/ABI > stability. > As I can see this patch doesn't introduce any public structs/unions. > All functions take as arguments just raw data buffers and length. > To summarize - in general, I don't see any good reason why this patch > shouldn't be allowed. > Konstantin The comments about rte_thash_gfni_supported initialization still apply. Why not: #ifdef __GFNI__ RTE_INIT(rte_thash_gfni_init) { if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_GFNI)) rte_thash_gfni_supported = 1; } #endif