Hi Florin,

The patch doesn't fix any of the issues with epoll/select.

The use case is as follows:
1. The main thread calls epoll_create.
2. One of the non-main threads calls epoll_ctl.
3. Another non-main thread calls epoll_wait.

All three threads operate on a single epoll fd. Is this use case supported?

How do I register these non-main threads [2 and 3] as workers with vcl? I am a newbie to VPP and have no idea how to do this. Can you give me some input?

Would registering these non-main threads [2, 3] as workers with vcl resolve my problem? Did you mean that LDP doesn't support this kind of registration?

Thanks,
Sharath.

On Sat 30 Mar, 2019, 1:27 AM Florin Coras, <fcoras.li...@gmail.com> wrote:

> Just so I understand, does the patch not fix the epoll issues or does it
> fix the issues but it doesn’t fix select, which apparently crashes in a
> different way.
>
> Second, what is your usecase/app? Are you actually trying to share
> epoll/select between multiple threads? That is, multiple threads might want
> to call epoll_wait/select at the same time? That is not supported. The
> implicit assumption is that only the dispatcher thread is to call the two
> functions; the rest of the threads do only io work.
>
> If all the threads must handle async communication via epoll/select, then
> they should register themselves as workers with vcl and get their own epoll
> fd. LDP does not support that.
>
> Florin
>
> On Mar 29, 2019, at 12:13 PM, Sharath Kumar <
> sharathkumarboyanapa...@gmail.com> wrote:
>
> No, it doesn't work.
>
> Attaching the applications being used.
>
> "Select" also has similar kind of issue when called from non-main thread
>
> Thread 9 "nstack_select" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fffd77fe700 (LWP 63170)]
> 0x00007ffff4e1d032 in ldp_select_init_maps (original=0x7fffbc0008c0,
> resultb=0x7fffe002e514, libcb=0x7fffe002e544, vclb=0x7fffe002e52c, nfds=34,
> minbits=64, n_bytes=5, si_bits=0x7fffd77fdc20,
> libc_bits=0x7fffd77fdc28) at
> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/ldp.c:601
> 601 clib_bitmap_validate (*vclb, minbits);
> (gdb) bt
> #0 0x00007ffff4e1d032 in ldp_select_init_maps (original=0x7fffbc0008c0,
> resultb=0x7fffe002e514, libcb=0x7fffe002e544, vclb=0x7fffe002e52c, nfds=34,
> minbits=64, n_bytes=5, si_bits=0x7fffd77fdc20,
> libc_bits=0x7fffd77fdc28) at
> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/ldp.c:601
> #1 0x00007ffff4e1db47 in ldp_pselect (nfds=34, readfds=0x7fffbc0008c0,
> writefds=0x7fffbc000cd0, exceptfds=0x7fffbc0010e0, timeout=0x7fffd77fdcb0,
> sigmask=0x0)
> at
> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/ldp.c:723
> #2 0x00007ffff4e1e5d5 in select (nfds=34, readfds=0x7fffbc0008c0,
> writefds=0x7fffbc000cd0, exceptfds=0x7fffbc0010e0, timeout=0x7fffd77fdd20)
> at
> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/ldp.c:857
> #3 0x00007ffff7b4c42a in nstack_select_thread (arg=0x0) at
> /home/root1/sharath/2019/vpp_ver/19.04/dmm/src/nSocket/nstack/event/select/nstack_select.c:651
> #4 0x00007ffff78ed6ba in start_thread (arg=0x7fffd77fe700) at
> pthread_create.c:333
> #5 0x00007ffff741b41d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>
>
> Before https://gerrit.fd.io/r/#/c/18597/ I have tried to fix the issue.
>
> The below changes fixed epoll_wait and epoll_ctl issues for me.[doesn't
> include the changes of https://gerrit.fd.io/r/#/c/18597/]
>
> diff --git a/src/vcl/vcl_locked.c b/src/vcl/vcl_locked.c
> index fb19b5d..e6c891b 100644
> --- a/src/vcl/vcl_locked.c
> +++ b/src/vcl/vcl_locked.c
> @@ -564,7 +564,10 @@ vls_attr (vls_handle_t vlsh, uint32_t op, void
> *buffer, uint32_t * buflen)
>
>    if (!(vls = vls_get_w_dlock (vlsh)))
>      return VPPCOM_EBADFD;
> +
> +  vls_mt_guard (0, VLS_MT_OP_XPOLL);
>    rv = vppcom_session_attr (vls_to_sh_tu (vls), op, buffer, buflen);
> +  vls_mt_unguard ();
>    vls_get_and_unlock (vlsh);
>    return rv;
>  }
> @@ -773,8 +776,10 @@ vls_epoll_ctl (vls_handle_t ep_vlsh, int op,
> vls_handle_t vlsh,
>    vls_table_rlock ();
>    ep_vls = vls_get_and_lock (ep_vlsh);
>    vls = vls_get_and_lock (vlsh);
> +  vls_mt_guard (0, VLS_MT_OP_XPOLL);
>    ep_sh = vls_to_sh (ep_vls);
>    sh = vls_to_sh (vls);
> +  vls_mt_unguard ();
>
>    if (PREDICT_FALSE (!vlsl->epoll_mp_check))
>      vls_epoll_ctl_mp_checks (vls, op);
>
> Thanks,
> Sharath.
>
> On Fri, Mar 29, 2019 at 9:15 PM Florin Coras <fcoras.li...@gmail.com>
> wrote:
>
>> Interesting. What application are you running and does this [1] fix the
>> issue for you?
>>
>> In short, many of vls’ apis check if the call is coming in on a new
>> pthread and program vcl accordingly if yes. The patch makes sure vls_attr
>> does that as well.
>>
>> Thanks,
>> Florin
>>
>> [1] https://gerrit.fd.io/r/#/c/18597/
>>
>> On Mar 29, 2019, at 4:29 AM, Dave Barach via Lists.Fd.Io
>> <http://lists.fd.io/> <dbarach=cisco....@lists.fd.io> wrote:
>>
>> For whatever reason, the vls layer received an event notification which
>> didn’t end well. vcl_worker_get (wrk_index=4294967295) [aka 0xFFFFFFFF]
>> will never work.
>>
>> I’ll let Florin comment further. He’s in the PDT time zone, so don’t
>> expect to hear from him for a few hours.
>>
>> D.
>>
>> *From:* vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> *On Behalf Of *sharath
>> kumar
>> *Sent:* Friday, March 29, 2019 12:18 AM
>> *To:* vpp-dev@lists.fd.io; csit-...@lists.fd.io
>> *Subject:* [vpp-dev] multi-threaded application, "epoll_wait" and
>> "epoll_ctl" have "received signal SIGABRT, Aborted".
>>
>> Hello all,
>>
>> I am a newbie to VPP.
>>
>> I am trying to run VPP with a multi-threaded application.
>> "recv" works fine from non-main threads,
>> whereas "epoll_wait" and "epoll_ctl" have "received signal SIGABRT,
>> Aborted".
>>
>> Is this a known issue?
>> Or am I doing something wrong?
>>
>> Attaching backtrace for "epoll_wait" and "epoll_ctl"
>>
>> Thread 9 "dmm_vcl_epoll" received signal SIGABRT, Aborted.
>> [Switching to Thread 0x7fffd67fe700 (LWP 56234)]
>> 0x00007ffff7349428 in __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:54
>> 54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> (gdb) bt
>> #0 0x00007ffff7349428 in __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:54
>> #1 0x00007ffff734b02a in __GI_abort () at abort.c:89
>> #2 0x00007ffff496d873 in os_panic () at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vppinfra/unix-misc.c:176
>> #3 0x00007ffff48ce42c in debugger () at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vppinfra/error.c:84
>> #4 0x00007ffff48ce864 in _clib_error (how_to_die=2, function_name=0x0,
>> line_number=0, fmt=0x7ffff4bfe0e0 "%s:%d (%s) assertion `%s' fails")
>> at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vppinfra/error.c:143
>> #5 0x00007ffff4bcca7d in vcl_worker_get (wrk_index=4294967295) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vcl_private.h:540
>> #6 0x00007ffff4bccabe in vcl_worker_get_current () at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vcl_private.h:554
>> #7 0x00007ffff4bd7c49 in vppcom_session_attr (session_handle=4278190080,
>> op=6, buffer=0x0, buflen=0x0) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vppcom.c:2606
>> #8 0x00007ffff4bfc7fd in vls_attr (vlsh=0, op=6, buffer=0x0, buflen=0x0)
>> at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vcl_locked.c:569
>> #9 0x00007ffff4e21736 in ldp_epoll_pwait (epfd=32,
>> events=0x7fffd67fad20, maxevents=1024, timeout=100, sigmask=0x0) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/ldp.c:2203
>> #10 0x00007ffff4e21948 in epoll_wait (epfd=32, events=0x7fffd67fad20,
>> maxevents=1024, timeout=100) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/ldp.c:2257
>> #11 0x00007ffff4e13041 in dmm_vcl_epoll_thread (arg=0x0) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/dmm_vcl_adpt.c:75
>> #12 0x00007ffff78ed6ba in start_thread (arg=0x7fffd67fe700) at
>> pthread_create.c:333
>> #13 0x00007ffff741b41d in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>
>>
>>
>>
>> Thread 11 "vs_epoll" received signal SIGABRT, Aborted.
>> 0x00007ffff7349428 in __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:54
>> 54 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> (gdb) bt
>> #0 0x00007ffff7349428 in __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:54
>> #1 0x00007ffff734b02a in __GI_abort () at abort.c:89
>> #2 0x00007ffff496d873 in os_panic () at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vppinfra/unix-misc.c:176
>> #3 0x00007ffff48ce42c in debugger () at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vppinfra/error.c:84
>> #4 0x00007ffff48ce864 in _clib_error (how_to_die=2, function_name=0x0,
>> line_number=0, fmt=0x7ffff4bfe1a0 "%s:%d (%s) assertion `%s' fails")
>> at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vppinfra/error.c:143
>> #5 0x00007ffff4bcca7d in vcl_worker_get (wrk_index=4294967295) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vcl_private.h:540
>> #6 0x00007ffff4bccabe in vcl_worker_get_current () at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vcl_private.h:554
>> #7 0x00007ffff4bd597a in vppcom_epoll_ctl (vep_handle=4278190080, op=1,
>> session_handle=4278190082, event=0x7fffd4dfb3b0)
>> at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vppcom.c:2152
>> #8 0x00007ffff4bfd061 in vls_epoll_ctl (ep_vlsh=0, op=1, vlsh=2,
>> event=0x7fffd4dfb3b0) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/vcl_locked.c:787
>> #9 0x00007ffff4e213b6 in epoll_ctl (epfd=32, op=1, fd=34,
>> event=0x7fffd4dfb3b0) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/ldp.c:2118
>> #10 0x00007ffff4e12f88 in vpphs_ep_ctl_ops (epFD=-1, proFD=34, ctl_ops=0,
>> events=0x7fffd5190078, pdata=0x7fffd53f01d0)
>> at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/stacks/vpp/vpp/src/vcl/dmm_vcl_adpt.c:48
>> #11 0x00007ffff7b4d502 in nsep_epctl_triggle (epi=0x7fffd5190018,
>> info=0x7fffd53f01d0, triggle_ops=0) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/src/nSocket/nstack/event/epoll/nstack_eventpoll.c:134
>> #12 0x00007ffff7b4de31 in nsep_insert_node (ep=0x7fffd50bffa8,
>> event=0x7fffd4dfb5a0, fdInfo=0x7fffd53f01d0)
>> at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/src/nSocket/nstack/event/epoll/nstack_eventpoll.c:250
>> #13 0x00007ffff7b4e480 in nsep_epctl_add (ep=0x7fffd50bffa8, fd=22,
>> events=0x7fffd4dfb5a0) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/src/nSocket/nstack/event/epoll/nstack_eventpoll.c:294
>> #14 0x00007ffff7b44db0 in nstack_epoll_ctl (epfd=21, op=1, fd=22,
>> event=0x7fffd4dfb630) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/src/nSocket/nstack/nstack_socket.c:2499
>> #15 0x0000000000401e65 in process_server_msg_thread (pArgv=<optimized
>> out>) at
>> /home/root1/sharath/2019/vpp_ver/19.04/dmm/app_example/perf-test/multi_tcp_epoll_app_Ser.c:369
>> #16 0x00007ffff78ed6ba in start_thread (arg=0x7fffd4dff700) at
>> pthread_create.c:333
>> #17 0x00007ffff741b41d in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
>>
>> Thanks and Regards,
>> Sharath.
>> -=-=-=-=-=-=-=-=-=-=-=-
>> Links: You receive all messages sent to this group.
>>
>> View/Reply Online (#12665): https://lists.fd.io/g/vpp-dev/message/12665
>> Mute This Topic: https://lists.fd.io/mt/30819724/675152
>> Group Owner: vpp-dev+ow...@lists.fd.io
>> Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [fcoras.li...@gmail.com
>> ]
>> -=-=-=-=-=-=-=-=-=-=-=-
>>
>>
>> <multi_tcp_epoll_app_Ser.c><multi_tcp_select_app_Ser.c>
>
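For reference, the threading pattern described at the top of this message (main thread creates the epoll fd, one non-main thread calls epoll_ctl, another non-main thread calls epoll_wait) can be sketched against plain Linux epoll as below. This is only an illustration of the use case, not of the fix: it does not go through LDP or VCL, a pipe stands in for a socket, and the names demo_ctx, ctl_thread, wait_thread and run_shared_epoll_demo are made up for this sketch.

```c
#include <pthread.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical context shared by the three threads in this sketch. */
struct demo_ctx
{
  int epfd;   /* epoll fd created by the main thread */
  int rfd;    /* read end of a pipe, standing in for a socket */
  int result; /* byte read by the waiter, or -1 on failure/timeout */
};

/* Step 2: a non-main thread registers the fd on the shared epoll fd. */
static void *
ctl_thread (void *arg)
{
  struct demo_ctx *ctx = arg;
  struct epoll_event ev = { 0 };
  int rc;

  ev.events = EPOLLIN;
  ev.data.fd = ctx->rfd;
  rc = epoll_ctl (ctx->epfd, EPOLL_CTL_ADD, ctx->rfd, &ev);
  return (void *) (intptr_t) rc; /* NULL means success */
}

/* Step 3: another non-main thread waits on the same epoll fd. */
static void *
wait_thread (void *arg)
{
  struct demo_ctx *ctx = arg;
  struct epoll_event ev;
  char c;

  /* 5 second timeout so the sketch cannot block forever. */
  if (epoll_wait (ctx->epfd, &ev, 1, 5000) == 1
      && read (ev.data.fd, &c, 1) == 1)
    ctx->result = (unsigned char) c;
  return NULL;
}

/* Step 1 plus orchestration; returns the byte the waiter read, or -1. */
int
run_shared_epoll_demo (void)
{
  struct demo_ctx ctx = { .epfd = -1, .rfd = -1, .result = -1 };
  pthread_t ctl, waiter;
  void *ctl_rc = NULL;
  int pfd[2];
  int wrote = 0;

  if (pipe (pfd) < 0)
    return -1;
  ctx.rfd = pfd[0];
  ctx.epfd = epoll_create1 (0); /* step 1: main thread creates the epoll fd */

  if (ctx.epfd >= 0
      && pthread_create (&waiter, NULL, wait_thread, &ctx) == 0)
    {
      if (pthread_create (&ctl, NULL, ctl_thread, &ctx) == 0)
        {
          pthread_join (ctl, &ctl_rc);
          /* Make the fd readable only once registration succeeded. */
          if (ctl_rc == NULL)
            wrote = (write (pfd[1], "x", 1) == 1);
        }
      pthread_join (waiter, NULL);
      if (!wrote)
        ctx.result = -1;
    }

  if (ctx.epfd >= 0)
    close (ctx.epfd);
  close (pfd[0]);
  close (pfd[1]);
  return ctx.result;
}
```

Calling run_shared_epoll_demo () from a small main () and building with -pthread should return 'x' on a stock Linux kernel. It says nothing about how the same calls behave when they are intercepted by LDP, which is the open question in this thread.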