Incrementing / random src/dst addr/port....

Thanks… Dave
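
A rough sketch of why varying the src/dst address/port matters for RSS: the NIC hashes each packet's flow tuple and uses the result to pick an rx queue, so a traffic stream with a single fixed tuple always lands on the same queue, and therefore on the same worker. The code below is only an illustration. It uses SHA-256 as a stand-in for the NIC's real RSS hash, and the function names, addresses, ports, and the queue count of 4 are all assumptions made for the example, not values taken from the setup discussed here.

```python
import hashlib
from collections import Counter

NUM_RX_QUEUES = 4  # assumption: one rx queue per worker, 4 workers


def rx_queue(src, dst, sport, dport, num_queues=NUM_RX_QUEUES):
    """Map a flow tuple to an rx queue index.

    SHA-256 is a stand-in for the hardware RSS hash; only the principle
    (same tuple -> same queue) carries over to a real NIC.
    """
    key = f"{src}|{dst}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_queues


def queue_histogram(flows, num_queues=NUM_RX_QUEUES):
    """Count how many packets land on each rx queue."""
    counts = Counter(rx_queue(*flow, num_queues=num_queues) for flow in flows)
    return [counts.get(q, 0) for q in range(num_queues)]


# Hypothetical traffic patterns (documentation-range addresses).
# 1) Every packet carries the same tuple.
fixed_flows = [("2001:db8::1", "2001:db8::2", 1024, 2048)] * 10000
# 2) The source port increments per packet.
varied_flows = [("2001:db8::1", "2001:db8::2", 1024 + i, 2048)
                for i in range(10000)]

fixed_hist = queue_histogram(fixed_flows)
varied_hist = queue_histogram(varied_flows)

print("fixed tuple      :", fixed_hist)
print("incrementing port:", varied_hist)
```

With the fixed tuple, all packets fall into one queue; with the incrementing port, every queue receives a share. The exact split on real hardware depends on the NIC's hash function and the configured RSS fields.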
From: Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
Sent: Monday, November 6, 2017 7:06 AM
To: Dave Barach (dbarach) <dbar...@cisco.com>
Cc: vpp-dev@lists.fd.io; John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns) <nra...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
Subject: Re: multi-core multi-threading performance

Hi Dave,

Thanks for the mail. A "show run" command shows the dpdk-input process on 2 of the workers, but the ip6-lookup process is running on only 1 worker. What config should be done to make all threads process traffic? This is for 4 workers and 1 main core.

Pasted output:

vpp# sh run
Thread 0 vpp_main (lcore 1)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State       Calls      Vectors    Suspends   Clocks    Vectors/Call
acl-plugin-fa-cleaner-process   any wait    0          0          15         4.97e3    0.00
api-rx-from-ring                active      0          0          79         1.07e5    0.00
cdp-process                     any wait    0          0          3          2.65e3    0.00
dpdk-process                    any wait    0          0          2          6.77e7    0.00
fib-walk                        any wait    0          0          7474       6.74e2    0.00
gmon-process                    time wait   0          0          1          4.24e3    0.00
ikev2-manager-process           any wait    0          0          7          7.04e3    0.00
ip6-icmp-neighbor-discovery-ev  any wait    0          0          7          4.67e3    0.00
lisp-retry-service              any wait    0          0          3          7.21e3    0.00
unix-epoll-input                polling     21655148   0          0          5.43e2    0.00
vpe-oam-process                 any wait    0          0          4          5.28e3    0.00
---------------
Thread 1 vpp_wk_0 (lcore 2)
Time 7.5, average vectors/node 255.99, last 128 main loops 14.00 per node 256.00
  vector rates in 4.1903e6, out 4.1903e6, drop 0.0000e0, punt 0.0000e0
Name                            State       Calls      Vectors    Suspends   Clocks    Vectors/Call
FortyGigabitEthernet4/0/0-outp  active      123334     31572992   0          6.58e0    255.99
FortyGigabitEthernet4/0/0-tx    active      123334     31572992   0          7.20e1    255.99
dpdk-input                      polling     124347     31572992   0          5.49e1    253.91
ip6-input                       active      123334     31572992   0          2.28e1    255.99
ip6-load-balance                active      123334     31572992   0          1.61e1    255.99
ip6-lookup                      active      123334     31572992   0          3.77e2    255.99
ip6-rewrite                     active      123334     31572992   0          2.02e1    255.99
---------------
Thread 2 vpp_wk_1 (lcore 3)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State       Calls      Vectors    Suspends   Clocks    Vectors/Call
dpdk-input                      polling     83188682   0          0          1.11e2    0.00
---------------
Thread 3 vpp_wk_2 (lcore 18)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State       Calls      Vectors    Suspends   Clocks    Vectors/Call
---------------
Thread 4 vpp_wk_3 (lcore 19)
Time 7.5, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Name                            State       Calls      Vectors    Suspends   Clocks    Vectors/Call

Thanks,
Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3...@rit.edu
ph : 585 764 4662

On Mon, Nov 6, 2017 at 6:47 AM, Dave Barach (dbarach) <dbar...@cisco.com> wrote:

Have you verified that all of the worker threads are processing traffic? Sufficiently poor RSS statistics could mean - in the limit - that only one worker thread is processing traffic.

Thanks… Dave

From: Pragash Vijayaragavan [mailto:pxv3...@rit.edu]
Sent: Sunday, November 5, 2017 10:03 PM
To: vpp-dev@lists.fd.io
Cc: John Marshall (jwm) <j...@cisco.com>; Neale Ranns (nranns) <nra...@cisco.com>; Dave Barach (dbarach) <dbar...@cisco.com>; Minseok Kwon <mxk...@rit.edu>
Subject: multi-core multi-threading performance

Hi,

We are measuring the performance of ip6 lookup in multi-core, multi-worker environments, and we don't see good scaling of performance as we keep increasing the number of cores/workers. We are just changing the startup.conf file to create more workers, rx-queues, sock-mem etc.

Should we do anything else to see an increase in performance? Is there a limitation on the performance even if we increase the number of workers? Is it dependent on the number of hardware NICs we have? We only have 1 NIC to receive the traffic.

TIA,

Thanks,
Pragash Vijayaragavan
Grad Student at Rochester Institute of Technology
email : pxv3...@rit.edu
ph : 585 764 4662
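
One way to check whether all worker threads are processing traffic is to pull the per-thread input vector rate out of the "sh run" output. The following is a minimal sketch: the `Thread N name (lcore M)` header and the `vector rates in ...` line are matched as they appear in the output quoted in this thread, and other VPP versions may print them differently. The function name `input_rates` and the trimmed `SAMPLE` text are illustrative and not part of VPP.

```python
import re

# Formats as they appear in the "sh run" output quoted in this thread.
THREAD_RE = re.compile(r"^Thread\s+(\d+)\s+(\S+)\s+\(lcore\s+(\d+)\)", re.M)
RATE_RE = re.compile(r"vector rates in\s+([0-9.]+(?:e[+-]?\d+)?)")


def input_rates(show_run_text):
    """Return {thread_name: input vector rate} for each thread section."""
    rates = {}
    headers = list(THREAD_RE.finditer(show_run_text))
    for i, header in enumerate(headers):
        # A section runs from this header to the next one (or end of text).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(show_run_text)
        section = show_run_text[header.end():end]
        match = RATE_RE.search(section)
        rates[header.group(2)] = float(match.group(1)) if match else 0.0
    return rates


# Trimmed copy of the header lines from the pasted output (node rows omitted).
SAMPLE = """
Thread 0 vpp_main (lcore 1)
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Thread 1 vpp_wk_0 (lcore 2)
  vector rates in 4.1903e6, out 4.1903e6, drop 0.0000e0, punt 0.0000e0
Thread 2 vpp_wk_1 (lcore 3)
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Thread 3 vpp_wk_2 (lcore 18)
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
Thread 4 vpp_wk_3 (lcore 19)
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
"""

rates = input_rates(SAMPLE)
idle_workers = [name for name, rate in rates.items()
                if name.startswith("vpp_wk_") and rate == 0.0]

print("input rates :", rates)
print("idle workers:", idle_workers)
```

For the output above, only vpp_wk_0 reports a non-zero input rate; the other three workers are listed as idle, which matches the observation that ip6-lookup runs on a single worker.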
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev