Hi Govind, 1) According to Jenkins, this patch permits some of the packets that should be denied, hence JJB voted "-1".
2) If you suspect merely the prefetches are the issue, just commenting out the body of prefetch_session_entry() in the original code should turn it into a no-op that doesn't break anything. Hard to say anything else given the functionality is not correct. In general - ensure you run "EXTENDED_TESTS=y TEST=acl* make test" as a sanity check before extensive perf-tests. It's not a 100% guarantee but it does catch a few naughty cases. Also - take a look at f1cd92d8d9, which got about 30% improvement back in the day, and is the source of much of the trickiness in that node. --a On 5/27/20, Govindarajan Mohandoss <govindarajan.mohand...@arm.com> wrote: > Hi Andrew, > > While profiling the ACL plugin node using perf tool in ARM Neoverse > platform, Bihash related prefetches were shown as bottleneck. > > Performance improvement is seen in ARM N1, TX2 and Intel Skylake servers > after removing those prefetches. Testing is done with Ingress ACL/IPv4 > forwarding in both SF and SL modes. > > As the code change is common for Ingress/Egress ACL for both IPv4 and IPv6, > performance improvement is expected for those cases also. > > Following are the test results for Ingress ACL / IPv4 / 1 core / 64B @ MRR > in ARM N1, TX2 and Intel Skylake servers: > > > > Legend: > > ======= > > N1 - ARM Neoverse > > TX2 - ARM Thunder X2 > > SKX - Intel Skylake > > SL: % imp - Performance improvement in stateless mode > > SF: % imp - Performance improvement in stateful mode > > > > > > > SKX > N1 > TX2 > Num Rules > Matching Rules > SL: Avg % imp > SF: Avg % imp > SL: % imp > SF: % imp > SL: % imp > SF: % imp > 1 > 1 > 0.99 > 12.09 > 8.38 > 10.41 > 4.48 > 4.63 > 50 > 1 (50th) > 0.79 > 9.63 > 8.76 > 10.06 > 5.32 > 4.63 > 100 > 1 (100th) > 4.34 > 10.75 > 8.60 > 10.06 > 6.98 > 4.63 > 1000 > 1(1000th) > 4.18 > 13.06 > 8.61 > 11.14 > 6.17 > 5.58 > 100 > 100 > 3.70 > 11.70 > 6.65 > 14 > 2.82 > 6.53 > 1000 > 1000 > 1.84 > 15.96 > 5.52 > 27.72 > 4.72 > 8.69 > > > > > > Please find the patch here: https://gerrit.fd.io/r/c/vpp/+/27167 > > > > I ran per patch regression on ARM Taishan server in CSIT lab. Following are > the results for Stateless and Stateful modes: > > 1. perftest-3n-tsh acl_statelessAND1cAND64b: > > > https://jenkins.fd.io/job/vpp-csit-verify-perf-master-3n-tsh/23/consoleFull > > In the log, I can see the comparative numbers between parent and > current (my patch) for 45 test cases. > > I searched for "Difference of averages relative to parent" in the log - > 41/45 test cases have shown around 4% improvement with the patch. Rest of > the 4 test cases stayed neutral. > > > > 2. perftest-3n-tsh acl_statefulAND1cAND64b: > > https://jenkins.fd.io/job/vpp-csit-verify-perf-master-3n-tsh/25/ > > Performance improvement is seen in all 36 test cases. > > > > Please provide your comments. > > > > Thanks > > Govind > > >
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16543): https://lists.fd.io/g/vpp-dev/message/16543 Mute This Topic: https://lists.fd.io/mt/74507621/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-