Dear Rubina, thank you for quickly checking it!
Judging by the logs, VPP quits, so I would say there should be a core file. Could you check? If you find it (double-check by the timestamps that it is indeed the fresh one), you can load it in gdb (using gdb 'path-to-vpp-binary' 'path-to-core') and then get the backtrace using 'bt'. This will give a better idea of what is going on.

--a

On 5/29/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
> Dear Andrew
>
> I tested your patch and my problem still exist, but my service status
> changed and now there isn't any information about deadlock problem. Do you
> have any idea about how I can provide you more information?
>
> root@MYRB:~# service vpp status
> * vpp.service - vector packet processing engine
> Loaded: loaded (/lib/systemd/system/vpp.service; disabled; vendor preset: enabled)
> Active: inactive (dead)
>
> May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded plugin: udp_ping_test_plugin.so
> May 29 09:27:06 MYRB /usr/bin/vpp[30805]: load_one_vat_plugin:67: Loaded plugin: stn_test_plugin.so
> May 29 09:27:06 MYRB vpp[30805]: /usr/bin/vpp[30805]: dpdk: EAL init args: -c 1ff -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w 0000:08:00.0 -w 0000:08:00.1 -w 0000:08
> May 29 09:27:06 MYRB /usr/bin/vpp[30805]: dpdk: EAL init args: -c 1ff -n 4 --huge-dir /run/vpp/hugepages --file-prefix vpp -w 0000:08:00.0 -w 0000:08:00.1 -w 0000:08:00.2 -w 000
> May 29 09:27:07 MYRB vnet[30805]: dpdk_ipsec_process:1012: not enough DPDK crypto resources, default to OpenSSL
> May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal SIGCONT, PC 0x7fa535dfbac0
> May 29 09:27:13 MYRB vnet[30805]: received SIGTERM, exiting...
> May 29 09:27:13 MYRB systemd[1]: Stopping vector packet processing engine...
> May 29 09:27:13 MYRB vnet[30805]: unix_signal_handler:124: received signal SIGTERM, PC 0x7fa534121867
> May 29 09:27:13 MYRB systemd[1]: Stopped vector packet processing engine.
>
> ________________________________
> From: Andrew 👽 Yourtchenko <ayour...@gmail.com>
> Sent: Monday, May 28, 2018 5:58 PM
> To: Rubina Bianchi
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [vpp-dev] Rx stuck to 0 after a while
>
> Dear Rubina,
>
> Thanks for catching and reporting this!
>
> I suspect what might be happening is my recent change of using two
> unidirectional sessions in bihash vs. the single one triggered a race,
> whereby as the owning worker is deleting the session,
> the non-owning worker is trying to update it. That would logically
> explain the "BUG: .." line (since you don't change the interfaces nor
> moving the traffic around, the 5 tuples should not collide), and as
> well the later stop.
>
> To take care of this issue, I think I will split the deletion of the
> session in two stages:
> 1) deactivation of the bihash entries that steer the traffic
> 2) freeing up the per-worker session structure
>
> and have a little pause time inbetween these two so that the
> workers-in-progress could
> finish updating the structures.
>
> The below gerrit is the first cut:
>
> https://gerrit.fd.io/r/#/c/12770/
>
> It passes the make test right now but I did not kick its tires too
> much yet, will do tomorrow.
>
> You can try this change out in your test setup as well and tell me how it
> feels.
>
> --a
>
> On 5/28/18, Rubina Bianchi <r_bian...@outlook.com> wrote:
>> Hi
>>
>> I run vpp v18.07-rc0~237-g525c9d0f with only 2 interface in stateful acl
>> (permit+reflect) and generated sfr traffic using trex v2.27. My rx will
>> become 0 after a short while, about 300 sec in my machine.
>> Here is vpp status:
>>
>> root@MYRB:~# service vpp status
>> * vpp.service - vector packet processing engine
>> Loaded: loaded (/lib/systemd/system/vpp.service; disabled; vendor preset: enabled)
>> Active: failed (Result: signal) since Mon 2018-05-28 11:35:03 +0130; 37s ago
>> Process: 32838 ExecStopPost=/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api (code=exited, status=0/SUCCESS)
>> Process: 31754 ExecStart=/usr/bin/vpp -c /etc/vpp/startup.conf (code=killed, signal=ABRT)
>> Process: 31750 ExecStartPre=/sbin/modprobe uio_pci_generic (code=exited, status=0/SUCCESS)
>> Process: 31747 ExecStartPre=/bin/rm -f /dev/shm/db /dev/shm/global_vm /dev/shm/vpe-api (code=exited, status=0/SUCCESS)
>> Main PID: 31754 (code=killed, signal=ABRT)
>>
>> May 28 16:32:47 MYRB vnet[31754]: acl_fa_node_fn:210: BUG: session LSB16(sw_if_index) and 5-tuple collision!
>> May 28 16:35:02 MYRB vnet[31754]: unix_signal_handler:124: received signal SIGCONT, PC 0x7f1fb591cac0
>> May 28 16:35:02 MYRB vnet[31754]: received SIGTERM, exiting...
>> May 28 16:35:02 MYRB systemd[1]: Stopping vector packet processing engine...
>> May 28 16:35:02 MYRB vnet[31754]: unix_signal_handler:124: received signal SIGTERM, PC 0x7f1fb3c40867
>> May 28 16:35:03 MYRB vpp[31754]: vlib_worker_thread_barrier_sync_int: worker thread deadlock
>> May 28 16:35:03 MYRB systemd[1]: vpp.service: Main process exited, code=killed, status=6/ABRT
>> May 28 16:35:03 MYRB systemd[1]: Stopped vector packet processing engine.
>> May 28 16:35:03 MYRB systemd[1]: vpp.service: Unit entered failed state.
>> May 28 16:35:03 MYRB systemd[1]: vpp.service: Failed with result 'signal'.
>>
>> I attach my vpp configs to this email. I also run this test with the same
>> config and added 4 interface instead of two. But in this case nothing
>> happened to vpp and it was functional for a long time.
>>
>> Thanks,
>> RB
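
For readers following the thread, the two-stage session deletion described in the quoted 5/28 message can be illustrated roughly as below. This is only a toy, single-threaded sketch in Python and not the code from the gerrit change: the class SessionTable, its methods, and the grace_period parameter are all hypothetical names made up for illustration, and a plain dict stands in for the bihash.

```python
import time


class SessionTable:
    """Toy model: a shared lookup table plus per-worker session storage.

    All names here are hypothetical; the dict `lookup` merely stands in
    for the bihash entries that steer traffic to a session.
    """

    def __init__(self, grace_period=0.5):
        self.lookup = {}        # 5-tuple -> session id
        self.sessions = {}      # session id -> per-worker session state
        self.pending_free = []  # (deadline, session id) waiting for stage 2
        self.grace_period = grace_period

    def add(self, five_tuple, session_id, state):
        self.sessions[session_id] = state
        self.lookup[five_tuple] = session_id

    def deactivate(self, five_tuple, now=None):
        """Stage 1: drop the lookup entry so no new packets reach the session."""
        now = time.monotonic() if now is None else now
        session_id = self.lookup.pop(five_tuple, None)
        if session_id is not None:
            # The session state stays allocated until the pause has elapsed,
            # so a worker that already holds the id can finish its update.
            self.pending_free.append((now + self.grace_period, session_id))
        return session_id

    def reap(self, now=None):
        """Stage 2: free sessions whose pause time has elapsed."""
        now = time.monotonic() if now is None else now
        freed, still_pending = [], []
        for deadline, session_id in self.pending_free:
            if now >= deadline:
                self.sessions.pop(session_id, None)
                freed.append(session_id)
            else:
                still_pending.append((deadline, session_id))
        self.pending_free = still_pending
        return freed
```

The point of the split is that between deactivate() and reap() the session can no longer be found by new lookups, yet its storage is still valid for any worker that looked it up just before. How long the pause needs to be, and how it interacts with real worker threads, is not something this sketch tries to answer.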