On Tue, Mar 10, 2020 at 8:14 PM Medvedkin, Vladimir <vladimir.medved...@intel.com> wrote: > > Hi Jerin,
Hi Vladimir, > > Are we missing __attribute__((aligned(64))) here? > > Agree. While modern compilers align __m512i by default, some old could > failure to align. Please correct me if I'm wrong. Yes. > > +} rte_zmm_t; > > IMO, Due to legacy reason, we have selected rte_xmm_t, rte_ymm_t for > 128 and 256 operations in public APIs[1] > > As for me, since these functions are inlined, prototype should be changed to > uint32_t ip[4] instead of passing vector type as an argument. OK. Makes sense. > # Not sure where xmm_t and ymm_t and new zmm_t come from? Is this name > x86 arch-specific? > > Yes, that's why they are in arch/x86/rte_vect.h See the last comment. > > If so, > why not give the more generic name rte_512i_t or something? > # Currently, In every arch file, we are repeating the definition for > rte_xmm_t, Why not make, this generic definition > in common file. ie. rte_zmm_t or rte_512i_t definition in common > file(./lib/librte_eal/common/include/generic/rte_vect.h) > > I think there could be some arch specific thing that prevents it from being > generic. > > # Currently ./lib/librte_eal/common/include/generic/rte_vect.h has > defintion for rte_vXsY_t for vector representation, would that > be enough for public API? Do we need to new type? > > Definitions for rte_vXsY_tare almost the same as compiler's __m[128,256,512]i > apart from alignment. > Union types such as rte_zmm_t are very useful because of the ability to > access parts of a wide vector register with an arbitrary granularity. For > example, some old compiler don't support _mm512_set_epi8()/_mm512_set_epi16() > intrinsics, so accessing ".u8[]" of ".u16[]" solves the problem. Yes. We are on the same page. I think, the only difference in thought is, the x86 specific definition(rte_zmm_t) name should be something it needs to be reflected as internal or arch-specific. Earlier APIs such rte_lpm_lookupx4 has leaked the xmm_t definition to public API. To avoid that danger, please make rte_zmm_t as internal/arch-specific. Something __rte_x86_zmm_t or so that denotes it is not a public symbol.