> On Mon, Oct 5, 2020 at 9:44 PM Konstantin Ananyev
> <konstantin.anan...@intel.com> wrote:
> >
> > This patch series introduces support of AVX512 specific classify
> > implementation for ACL library.
> > It adds two new algorithms:
> >  - RTE_ACL_CLASSIFY_AVX512X16 - can process up to 16 flows in parallel.
> >    It uses 256-bit width instructions/registers only
> >    (to avoid frequency level change).
> >    On my SKX box test-acl shows ~15-30% improvement
> >    (depending on rule-set and input burst size)
> >    when switching from AVX2 to AVX512X16 classify algorithms.
> >  - RTE_ACL_CLASSIFY_AVX512X32 - can process up to 32 flows in parallel.
> >    It uses 512-bit width instructions/registers and provides higher
> >    performance than AVX512X16, but can cause frequency level change.
> >    On my SKX box test-acl shows ~50-70% improvement
> >    (depending on rule-set and input burst size)
> >    when switching from AVX2 to AVX512X32 classify algorithms.
> > ICX and CLX testing showed similar level of speedup.
> >
> > Current AVX512 classify implementation is only supported on x86_64.
> > Note that this series introduces a formal ABI incompatibility
>
> The only API change I can see is in rte_acl_classify_alg() new error
> code but I don't think we need an announcement for this.
> As for ABI, we are breaking it in this release, so I see no pb.

Cool, I just wanted to underline that patch #3:
https://patches.dpdk.org/patch/79786/
is a formal ABI breakage.

> > with previous versions of ACL library.
> >
> > v2 -> v3:
> >  Fix checkpatch warnings
> >  Split AVX512 algorithm into two and deduplicate common code
>
> Patch 7 still references a RTE_MACHINE_CPUFLAG flag.
> Can you rework now that those flags have been dropped?
>

Should be fixed in v4:
https://patches.dpdk.org/project/dpdk/list/?series=12721

One more thing to mention - this series has a dependency on Vladimir's patch:
https://patches.dpdk.org/patch/79310/ ("eal/x86: introduce AVX 512-bit type"),
so CI/travis would still report an error.

Thanks
Konstantin
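
For anyone weighing the two new algorithms, the trade-off described in the
cover letter (AVX512X32 is faster but may cause a frequency level change,
AVX512X16 stays on 256-bit registers) could be sketched as a small selection
helper. This is only an illustration under stated assumptions: the enum,
the struct and the function below are hypothetical stand-ins and not part of
the ACL library; a real application would use the RTE_ACL_CLASSIFY_* values
from rte_acl.h and its own CPU feature checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the classify algorithms named in the cover
 * letter. The real enum lives in rte_acl.h; names and values here are
 * illustrative only.
 */
enum classify_alg {
	ALG_SCALAR,
	ALG_AVX2,
	ALG_AVX512X16,	/* up to 16 flows, 256-bit registers only */
	ALG_AVX512X32,	/* up to 32 flows, 512-bit registers */
};

/* Hypothetical summary of what the platform and the application allow. */
struct cpu_caps {
	bool has_avx2;
	bool has_avx512;	/* AVX512 features needed by the new code paths */
	bool allow_512bit;	/* application accepts a possible frequency change */
};

/*
 * Pick the widest algorithm that is both supported and acceptable:
 * AVX512X32 only when 512-bit registers are explicitly allowed,
 * otherwise AVX512X16, then AVX2, then the scalar fallback.
 */
static enum classify_alg
pick_classify_alg(const struct cpu_caps *caps)
{
	assert(caps != NULL);

	if (caps->has_avx512 && caps->allow_512bit)
		return ALG_AVX512X32;
	if (caps->has_avx512)
		return ALG_AVX512X16;
	if (caps->has_avx2)
		return ALG_AVX2;
	return ALG_SCALAR;
}
```

With has_avx512 set and allow_512bit cleared, the helper returns
ALG_AVX512X16, which matches the "avoid frequency level change" case above.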