> > v3 changes:
> > Applied review comments from Thomas:
> > - fix spelling errors reported by codespell.
> > - split last patch into two:
> >   first to remove unused macros,
> >   second to add some comments about ACL internal layout.
> >
> > v2 changes:
> > - When building with compilers that don't support AVX2 instructions,
> >   make rte_acl_classify_avx2() do nothing and return an error.
> > - Remove unneeded 'ifdef __AVX2__' in acl_run_avx2.*.
> > - Reorder the patches in the set, to keep RTE_LIBRTE_ACL_STANDALONE=y
> >   always buildable.
> >
> > This patch series contains several fixes and enhancements for the ACL
> > library.
> > See the complete list below.
> > Two main changes are externally visible:
> > - Introduce a new classify method: RTE_ACL_CLASSIFY_AVX2.
> >   It uses AVX2 instructions and 256-bit wide data types
> >   to perform internal trie traversal.
> >   That helps to increase classify() throughput.
> >   This method is selected as the default one on CPUs that support AVX2.
> > - Introduce a new field in the build config structure: max_size.
> >   It specifies the maximum size that the internal RT structure for a
> >   given context can reach.
> >   The purpose is to let the user decide on the space/performance
> >   trade-off
> >   (faster classify() vs less space for RT internal structures)
> >   for each given set of rules.
> >
> > Konstantin Ananyev (18):
> >   fix fix compilation issues with RTE_LIBRTE_ACL_STANDALONE=y
> >   app/test: few small fixes fot test_acl.c
> >   librte_acl: make data_indexes long enough to survive idle transitions.
> >   librte_acl: remove build phase heuristsic with negative performance
> >     effect.
> >   librte_acl: fix a bug at build phase that can cause matches beeing
> >     overwirtten.
> >   librte_acl: introduce DFA nodes compression (group64) for identical
> >     entries.
> >   librte_acl: build/gen phase - simplify the way match nodes are
> >     allocated.
> >   librte_acl: make scalar RT code to be more similar to vector one.
> >   librte_acl: a bit of RT code deduplication.
> >   EAL: introduce rte_ymm and relatives in rte_common_vect.h.
> >   librte_acl: add AVX2 as new rte_acl_classify() method
> >   test-acl: add ability to manually select RT method.
> >   librte_acl: Remove search_sse_2 and relatives.
> >   libter_acl: move lo/hi dwords shuffle out from calc_addr
> >   libte_acl: make calc_addr a define to deduplicate the code.
> >   libte_acl: introduce max_size into rte_acl_config.
> >   libte_acl: remove unused macros.
> >   libte_acl: add some comments about ACL internal layout.
> >
> For the series
> Acked-by: Neil Horman <nhorman at tuxdriver.com>

Applied

Thanks for the big work
--
Thomas