> On Fri, 15 Oct 2021 10:30:02 +0100
> Vladimir Medvedkin <vladimir.medved...@intel.com> wrote:
> 
> > +                   m[i * 8 + j] = (rss_key[i] << j)|
> > +                           (uint8_t)((uint16_t)(rss_key[i + 1]) >>
> > +                           (8 - j));
> > +           }
> 
> This ends up being harder than necessary to read. Maybe split into
> multiple statements and/or use temporary variable.
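
One possible shape for that split, as a minimal sketch only. It assumes `m` is a `uint8_t` array with 8 entries per key byte and that `rss_key[i + 1]` is always readable; `example_fill_windows` and `n_bytes` are hypothetical names, not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the patch's code. For each key byte i it stores
 * the eight 8-bit windows that start at bit offsets 0..7 of the byte pair
 * (rss_key[i], rss_key[i + 1]).
 * Assumes rss_key has at least n_bytes + 1 readable bytes and that m has
 * room for n_bytes * 8 entries.
 */
static void
example_fill_windows(uint8_t *m, const uint8_t *rss_key, int n_bytes)
{
	for (int i = 0; i < n_bytes; i++) {
		/* Two adjacent key bytes joined into one 16-bit value. */
		uint16_t pair = (uint16_t)((rss_key[i] << 8) | rss_key[i + 1]);

		for (int j = 0; j < 8; j++) {
			/* Keep the 8 bits that start j bits into the pair. */
			m[i * 8 + j] = (uint8_t)(pair >> (8 - j));
		}
	}
}
```

With the temporary, each store is a single shift and truncation, which should give the same bytes as the original expression as long as `m` holds 8-bit values.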
> 
> > +RTE_INIT(rte_thash_gfni_init)
> > +{
> > +   rte_thash_gfni_supported = 0;
> 
> Not necessary; in C, globals are initialized to zero by default.
> 
> By removing that, the constructor can be totally behind #ifdef.
> 
> > +__rte_internal
> > +static inline __m512i
> > +__rte_thash_gfni(const uint64_t *mtrx, const uint8_t *tuple,
> > +   const uint8_t *secondary_tuple, int len)
> > +{
> > +   __m512i permute_idx = _mm512_set_epi8(7, 6, 5, 4, 7, 6, 5, 4,
> > +                                           6, 5, 4, 3, 6, 5, 4, 3,
> > +                                           5, 4, 3, 2, 5, 4, 3, 2,
> > +                                           4, 3, 2, 1, 4, 3, 2, 1,
> > +                                           3, 2, 1, 0, 3, 2, 1, 0,
> > +                                           2, 1, 0, -1, 2, 1, 0, -1,
> > +                                           1, 0, -1, -2, 1, 0, -1, -2,
> > +                                           0, -1, -2, -3, 0, -1, -2, -3);
> 
> NAK
> 
> Please don't put the implementation in an inline. This makes it harder
> to support (API/ABI) and blocks other architectures from implementing
> same thing with different instructions.

I don't really understand your reasoning here.
rte_thash_gfni.h is an arch-specific header that provides
arch-specific optimizations for RSS hash calculation
(Vladimir, please correct me if I am wrong here).
We already have dozens of inline functions that use arch-specific
instructions (both x86 and Arm) for different purposes:
sync primitives, memory ordering, cache manipulation, LPM lookup, TSX,
power saving, etc.
That's the usual trade-off taken for performance reasons, when an extra
function call costs too much compared to the operation itself.
Why has it suddenly become a problem in this particular case, and how
exactly does it block other architectures?
Also, I don't understand how it makes things harder in terms of API/ABI
stability.
As far as I can see, this patch doesn't introduce any public structs/unions.
All functions take just raw data buffers and a length as arguments.
To summarize: in general, I don't see any good reason why this patch
shouldn't be allowed.
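
To illustrate the kind of arch-specific inline I mean, here is a minimal sketch. It uses CRC32C rather than GFNI because the intrinsics are simpler, and all `example_*` names are hypothetical, not existing DPDK API. Each architecture supplies its own inline body behind a compile-time check, and a portable fallback covers everything else.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not DPDK code. */
#if defined(__SSE4_2__)
#include <nmmintrin.h>
/* x86: one CRC32 instruction per byte. */
static inline uint32_t
example_crc32c_u8(uint32_t crc, uint8_t byte)
{
	return _mm_crc32_u8(crc, byte);
}
#elif defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>
/* Arm: same operation through the ACLE CRC32C intrinsic. */
static inline uint32_t
example_crc32c_u8(uint32_t crc, uint8_t byte)
{
	return __crc32cb(crc, byte);
}
#else
/* Portable fallback: bitwise CRC32C, reflected polynomial 0x82F63B78. */
static inline uint32_t
example_crc32c_u8(uint32_t crc, uint8_t byte)
{
	crc ^= byte;
	for (int k = 0; k < 8; k++)
		crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	return crc;
}
#endif

/* Callers see one signature that takes only a raw buffer and a length. */
static inline uint32_t
example_crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++)
		crc = example_crc32c_u8(crc, buf[i]);
	return crc ^ 0xFFFFFFFFu;
}
```

In a layout like this, adding another architecture means adding one more branch, and nothing in the caller-visible signature changes.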
Konstantin