On Tue, Oct 6, 2020 at 5:08 PM Konstantin Ananyev
<konstantin.anan...@intel.com> wrote:
>
> This patch series introduces support for an AVX512-specific classify
> implementation for the ACL library.
> It adds two new algorithms:
>  - RTE_ACL_CLASSIFY_AVX512X16 - can process up to 16 flows in parallel.
>    It uses 256-bit width instructions/registers only
>    (to avoid frequency level change).
>    On my SKX box test-acl shows ~15-30% improvement
>    (depending on rule-set and input burst size)
>    when switching from the AVX2 to the AVX512X16 classify algorithm.
>  - RTE_ACL_CLASSIFY_AVX512X32 - can process up to 32 flows in parallel.
>    It uses 512-bit width instructions/registers and provides higher
>    performance than AVX512X16, but can cause frequency level change.
>    On my SKX box test-acl shows ~50-70% improvement
>    (depending on rule-set and input burst size)
>    when switching from the AVX2 to the AVX512X32 classify algorithm.
> ICX and CLX testing showed similar level of speedup.
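
For readers weighing the two algorithms described in the quoted cover letter, here is a rough sketch of one way an application might choose between them: prefer the widest method, optionally skip the 512-bit one to avoid the frequency level change, and fall back when a method is rejected. Everything in it is a stand-in so that it compiles without DPDK: the enum, `struct acl_ctx`, `set_classify()` and `pick_classify()` are hypothetical names, not part of the series. In a real application the setter would be `rte_acl_set_ctx_classify()` with the `RTE_ACL_CLASSIFY_*` values, and the sketch assumes that call returns a negative errno for a method the platform cannot run.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RTE_ACL_CLASSIFY_* values, ordered
 * from narrowest to widest so they can be compared numerically. */
enum acl_alg { ALG_SCALAR, ALG_AVX2, ALG_AVX512X16, ALG_AVX512X32 };

/* Hypothetical context: remembers the selected method and the widest
 * method this (simulated) platform supports. */
struct acl_ctx {
    enum acl_alg alg;
    int max_supported;
};

/* Stub setter: rejects methods wider than the platform supports.
 * Assumption: the real setter fails in a similar way. */
static int set_classify(struct acl_ctx *ctx, enum acl_alg alg)
{
    if ((int)alg > ctx->max_supported)
        return -ENOTSUP;
    ctx->alg = alg;
    return 0;
}

/* Try methods from widest to narrowest and return the first accepted.
 * With allow_512bit == 0 the 512-bit method is skipped, trading peak
 * throughput for no frequency level change. */
static enum acl_alg pick_classify(struct acl_ctx *ctx, int allow_512bit)
{
    static const enum acl_alg pref[] = {
        ALG_AVX512X32, ALG_AVX512X16, ALG_AVX2, ALG_SCALAR
    };
    size_t i;

    for (i = 0; i < sizeof(pref) / sizeof(pref[0]); i++) {
        if (pref[i] == ALG_AVX512X32 && !allow_512bit)
            continue;
        if (set_classify(ctx, pref[i]) == 0)
            return pref[i];
    }
    return ALG_SCALAR; /* scalar is always accepted by the stub */
}
```

Calling `pick_classify(&ctx, 0)` on a context whose `max_supported` is `ALG_AVX512X32` selects the 16-flow method, which matches the trade-off described above for workloads that share cores with frequency-sensitive code.
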

Series applied, thanks.

--
David Marchand