Hi Stephen,

Thanks for reviewing

On 15/10/2021 18:58, Stephen Hemminger wrote:
On Fri, 15 Oct 2021 10:30:02 +0100
Vladimir Medvedkin <vladimir.medved...@intel.com> wrote:

+                       m[i * 8 + j] = (rss_key[i] << j)|
+                               (uint8_t)((uint16_t)(rss_key[i + 1]) >>
+                               (8 - j));
+               }

This ends up being harder than necessary to read. Maybe split into
multiple statements and/or use temporary variable.

+RTE_INIT(rte_thash_gfni_init)
+{
+       rte_thash_gfni_supported = 0;

Not necessary in C globals are initialized to zero by default.

By removing that the constructor can be totally behind #ifdef

+__rte_internal
+static inline __m512i
+__rte_thash_gfni(const uint64_t *mtrx, const uint8_t *tuple,
+       const uint8_t *secondary_tuple, int len)
+{
+       __m512i permute_idx = _mm512_set_epi8(7, 6, 5, 4, 7, 6, 5, 4,
+                                               6, 5, 4, 3, 6, 5, 4, 3,
+                                               5, 4, 3, 2, 5, 4, 3, 2,
+                                               4, 3, 2, 1, 4, 3, 2, 1,
+                                               3, 2, 1, 0, 3, 2, 1, 0,
+                                               2, 1, 0, -1, 2, 1, 0, -1,
+                                               1, 0, -1, -2, 1, 0, -1, -2,
+                                               0, -1, -2, -3, 0, -1, -2, -3);

NAK

Please don't put the implementation in an inline. This makes it harder
to support (API/ABI) and blocks other architectures from implementing
same thing with different instructions.


By making this function not inline, its performance drops by about 2 times. Compiler optimization (at least with respect to the len argument) helps a lot in the implementation.


--
Regards,
Vladimir

Reply via email to