Hi Stephen, On 19/10/2021 03:15, Stephen Hemminger wrote:
On Mon, 18 Oct 2021 10:40:00 +0000 "Ananyev, Konstantin" <konstantin.anan...@intel.com> wrote:On Fri, 15 Oct 2021 10:30:02 +0100 Vladimir Medvedkin <vladimir.medved...@intel.com> wrote:+ m[i * 8 + j] = (rss_key[i] << j)| + (uint8_t)((uint16_t)(rss_key[i + 1]) >> + (8 - j)); + }This ends up being harder than necessary to read. Maybe split into multiple statements and/or use temporary variable.+RTE_INIT(rte_thash_gfni_init) +{ + rte_thash_gfni_supported = 0;Not necessary in C globals are initialized to zero by default. By removing that the constructor can be totally behind #ifdef+__rte_internal +static inline __m512i +__rte_thash_gfni(const uint64_t *mtrx, const uint8_t *tuple, + const uint8_t *secondary_tuple, int len) +{ + __m512i permute_idx = _mm512_set_epi8(7, 6, 5, 4, 7, 6, 5, 4, + 6, 5, 4, 3, 6, 5, 4, 3, + 5, 4, 3, 2, 5, 4, 3, 2, + 4, 3, 2, 1, 4, 3, 2, 1, + 3, 2, 1, 0, 3, 2, 1, 0, + 2, 1, 0, -1, 2, 1, 0, -1, + 1, 0, -1, -2, 1, 0, -1, -2, + 0, -1, -2, -3, 0, -1, -2, -3);NAK Please don't put the implementation in an inline. This makes it harder to support (API/ABI) and blocks other architectures from implementing same thing with different instructions.I don't really understand your reasoning here. rte_thash_gfni.h is an arch-specific header, which provides arch-specific optimizations for RSS hash calculation (Vladimir pls correct me if I am wrong here).Ok, but rte_thash_gfni.h is included on all architectures.
Ok, I'll rework the patch to move x86 + avx512 related things into x86 arch specific header. Would that suit?
We do have dozens of inline functions that do use arch-specific instructions (both x86 and arm) for different purposes: sync primitives, memory-ordering, cache manipulations, LPM lookup, TSX, power-saving, etc. That's a usual trade-off taken for performance reasons, when extra function call costs too much comparing to the operation itself. Why it suddenly became a problem for that particular case and how exactly it blocks other architectures? Also I don't understand how it makes things harder in terms of API/ABI stability. As I can see this patch doesn't introduce any public structs/unions. All functions take as arguments just raw data buffers and length. To summarize - in general, I don't see any good reason why this patch shouldn't be allowed. KonstantinThe comments about rte_thash_gfni_supported initialization still apply. Why not: #ifdef __GFNI__ RTE_INIT(rte_thash_gfni_init) { if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_GFNI)) rte_thash_gfni_supported = 1; } #endif
Agree, I'll reflect this changes in v3. -- Regards, Vladimir