Dear Andrew,

My Trex config is uploaded; I also tested the scenario with your Trex config. The stability of vpp in your run is strange. When I run this scenario, vpp crashes on my DUT machine after about 200 seconds of running Trex. During this period the #del sessions count stays at 0 until the session pool becomes full; after that, session deletion starts. But its rate is lower than the one I see when I run vpp on a single core.
Could you please check my configs once again for any misconfiguration? Are there any specific devices that vpp or dpdk is known to be incompatible with?

Thanks,
Sincerely

Sent from Outlook<http://aka.ms/weboutlook>
________________________________
From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
Sent: Monday, March 12, 2018 1:50 PM
To: Rubina Bianchi
Cc: vpp-dev@lists.fd.io
Subject: Re: [vpp-dev] Freezing Session Deletion Operation

Dear Rubina,

I've tried the test locally using the data that you sent. Here is the
output from my trex after 10 minutes of running:

-Per port stats table
      ports |               0 |               1
 -----------------------------------------------
   opackets |       312605970 |       312191927
     obytes |    100919855857 |    174147108346
   ipackets |       311329098 |       277120788
     ibytes |    173666531289 |     76492053900
    ierrors |               0 |               0
    oerrors |               0 |               0
      Tx Bw |       1.17 Gbps |       2.01 Gbps

-Global stats enabled
 Cpu Utilization : 21.2 %  30.0 Gb/core
 Platform_factor : 1.0
 Total-Tx        : 3.18 Gbps
 Total-Rx        : 2.89 Gbps
 Total-PPS       : 901.93 Kpps
 Total-CPS       : 13.52 Kcps

 Expected-PPS    : 901.92 Kpps
 Expected-CPS    : 13.53 Kcps
 Expected-BPS    : 3.18 Gbps

 Active-flows    : 8883     Clients : 255    Socket-util : 0.0553 %
 Open-flows      : 9425526  Servers : 65535  Socket : 8883  Socket/Clients : 34.8
 drop-rate       : 0.00 bps
 current time    : 702.8 sec
 test duration   : 2897.2 sec

So, in my setup it worked and I could not see the behavior you describe...

But we have at least one more thing that may be different between our
setups: the trex config. Here is what mine looks like:

- version: 2
  interfaces: ['03:00.0', '03:00.1']
  port_limit: 2
  memory:
    dp_flows: 2000000
  port_info:
    - ip: 1.1.1.1
      default_gw: 1.1.1.2
    - ip: 1.1.1.2
      default_gw: 1.1.1.1

Could you send me your trex config to see if that might be the
difference between our setups, so I could try it locally?

Thanks!
--a

On 3/12/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
> Hi Dear Andrew
>
> I repeated my scenarios once again with short timeouts and uploaded all
> configs and outputs for your consideration.
> It is clear to me that the session cleaner process doesn't work properly
> and my Trex throughput is stuck at 0.
> Please repeat this scenario to verify this (unfortunately vpp is only
> stable for 200 seconds and after that vpp will be down).
>
> Thanks,
> Sincerely
>
> Sent from Outlook<http://aka.ms/weboutlook>
> ________________________________
> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
> Sent: Sunday, March 11, 2018 3:48 PM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Freezing Session Deletion Operation
>
> Hi Rubina,
>
> I am assuming you are observing this both in the single core and the
> multicore scenario?
>
> Based on the outputs, this is what I think might be going on:
>
> I am seeing the total# of sessions is 1000000, and no TCP transient
> sessions - thus the packets that require a session are dropped.
>
> What is a bit peculiar is that the session delete# per-worker are
> non-zero, yet the delete counters are zero. To me this indicates
> there was a fair bit of transient sessions, which also then got
> recycled by the TCP sessions properly established, before the idle
> timeout has expired.
>
> And at the moment of taking the show command output the connection
> cleaner activity has not yet kicked in - I do not see either any
> session deleted by idle timeout or its timer restarted. Which makes
> me think that the time interval in which you are testing must be
> relatively short...
>
> So, assuming the time between the start of the traffic and the time
> you have 1m sessions is quite short, this is simply using up all of
> the connection pool, a classic inherent resource management issue with
> any stateful scenario.
>
> You can verify that the sessions delete and start building again if
> you issue "clear acl-plugin sessions".
>
> Also, changing the session timeouts to more aggressive values (say, 10
> seconds) should kick off the aggressive connection cleaning, and thus
> should unlock this condition. Of course, a shorter idle time means
> potentially useful connections get removed. (The commands are
> "set acl-plugin session timeout <udp|tcp> idle <X>".)
>
> *If* neither of the above adequately describes what you are seeing,
> the cleaner node may for whatever reason have ceased to kick in every
> half a second.
>
> To see the dynamics of the conn cleaner node, you can use the debug
> command "set acl-plugin session event-trace 1" before the start of the
> test. This will produce the event trace, which you can view by
> "show event-logger all" - this should give a reasonable idea about
> what the cleaner node is up to.
>
> Please let me know.
>
> --a
>
> On 3/11/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
>> Hi,
>>
>> I am testing vpp_18.01.1-124~g302ef0b5 (commit:
>> 696e0da1cde7e2bcda5a781f9ad286672de8dcbf) and vpp_18.04-rc0~322-g30787378
>> (commit: 30787378f49714319e75437b347b7027a949700d) using Trex with the
>> sfr scenario in single core and multicore mode.
>> After a while I saw the session deletion rate decrease and vpp
>> throughput become 0 bps.
>> All configuration files and outputs are attached.
>>
>> Thanks,
>> Sincerely
>>
>> Sent from Outlook<http://aka.ms/weboutlook>
>>
>
trex_config.yaml
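
The pool-exhaustion reasoning in Andrew's March 11 reply can be checked with some rough arithmetic. The Python sketch below is only an illustration under stated assumptions, not a description of how the acl-plugin actually accounts for sessions. It assumes the pool holds the 1000000 sessions mentioned in that reply, it borrows the 13.52 Kcps and 8883 active flows from Andrew's own trex run (Rubina's load may well differ), and it models a session's lifetime as mean flow duration plus idle timeout, ignoring transient sessions and the cleaner's half-second cadence. All function names are made up for this example.

```python
# Back-of-the-envelope model of a stateful session pool under steady load.
# ASSUMPTIONS (not taken from the attached outputs):
#   - pool size is the 1000000 sessions mentioned in the thread
#   - connection rate and active flows come from Andrew's trex run
#   - a session occupies a pool slot for (flow duration + idle timeout)

def seconds_to_fill(pool_size: int, new_sessions_per_sec: float) -> float:
    """Time until the pool is full if no session is ever deleted."""
    return pool_size / new_sessions_per_sec


def steady_state_sessions(new_sessions_per_sec: float,
                          mean_flow_sec: float,
                          idle_timeout_sec: float) -> float:
    """Little's law: occupancy = arrival rate * time spent in the pool."""
    return new_sessions_per_sec * (mean_flow_sec + idle_timeout_sec)


def max_idle_timeout(pool_size: int,
                     new_sessions_per_sec: float,
                     mean_flow_sec: float) -> float:
    """Largest idle timeout for which steady-state occupancy fits the pool."""
    return pool_size / new_sessions_per_sec - mean_flow_sec


POOL_SIZE = 1_000_000      # sessions, from the thread
CPS = 13_520.0             # Total-CPS of 13.52 Kcps from Andrew's run
ACTIVE_FLOWS = 8883        # Active-flows from Andrew's run
MEAN_FLOW_SEC = ACTIVE_FLOWS / CPS  # Little's law applied to trex's own stats

if __name__ == "__main__":
    print(f"pool full after  : {seconds_to_fill(POOL_SIZE, CPS):.1f} s")
    print(f"sessions at 10 s : "
          f"{steady_state_sessions(CPS, MEAN_FLOW_SEC, 10.0):.0f}")
    print(f"max idle timeout : "
          f"{max_idle_timeout(POOL_SIZE, CPS, MEAN_FLOW_SEC):.1f} s")
```

Under these assumptions a pool of that size fills well within the roughly 200 seconds mentioned above unless sessions are being deleted, while a 10 second idle timeout would keep occupancy far below the limit. That is consistent with the suggestion to try shorter timeouts, but it says nothing about why the deletion rate differs between single core and multicore runs.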