Hi Florin,
Regarding 3), I think the main problem may be in the function vl_socket_client_recv_fd_msg, called by vcl_session_app_add_segment_handler. Multiple worker threads share the same scm->client_socket.fd, so B2 may receive the segment memfd that belongs to A1 (i.e., the segment intended for B1).
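To make the suspected race concrete, here is a minimal stand-alone sketch, plain POSIX and not VPP code; every name in it (send_fd, recv_fd, worker, the tags) is made up for illustration only. Two worker threads pull SCM_RIGHTS fds from one shared unix socket, so the worker handling one segment notification can just as easily pull the memfd that was sent for the other:

/* Sketch only: two threads receiving SCM_RIGHTS fds from the SAME unix
 * socket.  Which thread gets which fd depends purely on who calls recvmsg()
 * first, so the fd/notification pairing can cross -- the race suspected
 * above for the shared scm->client_socket.fd.  Not VPP code. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

static int shared_sock; /* plays the role of scm->client_socket.fd */

static void
send_fd (int sock, int fd, char tag)
{
  struct iovec iov = { .iov_base = &tag, .iov_len = 1 };
  union { struct cmsghdr align; char buf[CMSG_SPACE (sizeof (int))]; } u;
  memset (&u, 0, sizeof (u));
  struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                        .msg_control = u.buf, .msg_controllen = sizeof (u.buf) };
  struct cmsghdr *c = CMSG_FIRSTHDR (&msg);
  c->cmsg_level = SOL_SOCKET;
  c->cmsg_type = SCM_RIGHTS;
  c->cmsg_len = CMSG_LEN (sizeof (int));
  memcpy (CMSG_DATA (c), &fd, sizeof (int));
  sendmsg (sock, &msg, 0);
}

static int
recv_fd (int sock, char *tag)
{
  struct iovec iov = { .iov_base = tag, .iov_len = 1 };
  union { struct cmsghdr align; char buf[CMSG_SPACE (sizeof (int))]; } u;
  struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                        .msg_control = u.buf, .msg_controllen = sizeof (u.buf) };
  if (recvmsg (sock, &msg, 0) <= 0)
    return -1;
  struct cmsghdr *c = CMSG_FIRSTHDR (&msg);
  int fd = -1;
  if (c && c->cmsg_type == SCM_RIGHTS)
    memcpy (&fd, CMSG_DATA (c), sizeof (int));
  return fd;
}

static void *
worker (void *arg)
{
  /* B1 and B2 both land here; nothing ties the fd recvmsg() hands back to
     the segment notification this particular worker is handling. */
  char tag;
  int fd = recv_fd (shared_sock, &tag);
  printf ("worker handling notification %c got the fd sent for segment %c (fd=%d)\n",
          *(char *) arg, tag, fd);
  return 0;
}

int
main (void)
{
  int sp[2];
  /* SOCK_SEQPACKET keeps message boundaries; sp[0] is the "VPP" side. */
  socketpair (AF_UNIX, SOCK_SEQPACKET, 0, sp);
  shared_sock = sp[1];

  send_fd (sp[0], 0 /* stands in for segment A1's memfd */, '1');
  send_fd (sp[0], 1 /* stands in for segment A2's memfd */, '2');

  pthread_t t1, t2;
  char b1 = '1', b2 = '2';
  pthread_create (&t1, 0, worker, &b1);
  pthread_create (&t2, 0, worker, &b2);
  pthread_join (t1, 0);
  pthread_join (t2, 0);
  return 0;
}

Depending on scheduling, the worker handling notification '1' may print the fd tagged '2', which is the same kind of crossing that would hand B2 the memfd meant for the segment at A1.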
Regards,
Hanlin
On 11/22/2019 01:44, Florin Coras <fcoras.li...@gmail.com> wrote:
Hi Hanlin,

As Jon pointed out, you may want to register with gerrit.

Your comments with respect to points 1) and 2) are spot on. I've updated the patch to fix them.

Regarding 3), if I understood your scenario correctly, it should not happen. The ssvm infra forces applications to map segments at fixed addresses. That is, for the scenario you're describing below, if B2 is processed first, ssvm_slave_init_memfd will map the segment at A2. Note how we first map the segment to read the shared header (sh) and then use sh->ssvm_va (which should be A2) to remap the segment at a fixed virtual address (va). (A minimal sketch of this two-step mapping appears after the quoted thread below.)

Regards,
Florin

On Nov 21, 2019, at 2:49 AM, wanghanlin <wanghan...@corp.netease.com> wrote:

Hi Florin,

I have applied the patch and found some problems in my case. I don't have the rights to post in gerrit, so I post them here.

1) evt->event_type should be set to SESSION_CTRL_EVT_APP_DEL_SEGMENT rather than SESSION_CTRL_EVT_APP_ADD_SEGMENT.
File: src/vnet/session/session_api.c, Line: 561, Function: mq_send_del_segment_cb

2) session_send_fds should be called at the end of mq_send_add_segment_cb; otherwise the app_mq lock cannot be freed here.
File: src/vnet/session/session_api.c, Line: 519, Function: mq_send_add_segment_cb

3) When vcl_segment_attach is called in each worker thread, ssvm_slave_init_memfd can also be called in each worker thread, and ssvm_slave_init_memfd picks the map address sequentially by mapping the segment once in advance. That is fine with only one thread, but may go wrong with multiple worker threads. Suppose the following scenario: VPP allocates a segment at address A1 and notifies worker thread B1, expecting B1 to also map the segment at A1; simultaneously, VPP allocates a segment at address A2 and notifies worker thread B2, expecting B2 to map it at A2. If B2 processes its notification first, ssvm_slave_init_memfd may map the segment at address A1. Maybe VPP could include the segment map address in the notification message, and the worker thread would then simply map the segment at that address.

Regards,
Hanlin

Hi Florin,

VPP version is v19.08. I'll apply this patch and check it. Thanks a lot!

Regards,
Hanlin

Hi Hanlin,

Just to make sure, are you running master or some older VPP?

Regarding the issue you could be hitting below, here's [1] a patch that I have not yet pushed for merging because it leads to api changes for applications that directly use the session layer application interface instead of vcl. I haven't tested it extensively, but the goal with it is to signal segment allocation/deallocation over the mq instead of the binary api.

Finally, I've never tested LDP with Envoy, so I am not sure that works properly. There's ongoing work to integrate Envoy with VCL, so you may want to get in touch with the authors.

Regards,
Florin

On Nov 15, 2019, at 2:26 AM, wanghanlin <wanghan...@corp.netease.com> wrote:

Hi all,

I accidentally got the following crash stack when I used VCL with hoststack and memfd. However, the corresponding "invalid" rx_fifo address (0x2f42e2480) is valid in the VPP process and can also be found in /proc/<pid>/maps. That is, the shared memfd segment memory is not consistent between the hoststack app and VPP.

Generally, VPP allocates/deallocates a memfd segment and then notifies the hoststack app to attach/detach. But what happens if, just after VPP deallocates a memfd segment and notifies the hoststack app, VPP immediately allocates the same memfd segment again because a session connected?
Because the hoststack app processes the dealloc message and the connected message on different threads, maybe rx_thread_fn has just detached the memfd segment but not yet attached the same memfd segment again when, unfortunately, the worker thread gets the connected message. These are just my guesses; maybe I misunderstand.

(gdb) bt
#0  0x00007f7cde21ffbf in raise () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x0000000001190a64 in Envoy::SignalAction::sigHandler (sig=11, info=<optimized out>, context=<optimized out>) at source/common/signal/signal_action.cc:73
#2  <signal handler called>
#3  0x00007f7cddc2e85e in vcl_session_connected_handler (wrk=0x7f7ccd4bad00, mp=0x224052f4a) at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:471
#4  0x00007f7cddc37fec in vcl_epoll_wait_handle_mq_event (wrk=0x7f7ccd4bad00, e=0x224052f48, events=0x395000c, num_ev=0x7f7cca49e5e8) at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2658
#5  0x00007f7cddc3860d in vcl_epoll_wait_handle_mq (wrk=0x7f7ccd4bad00, mq=0x224042480, events=0x395000c, maxevents=63, wait_for_time=0, num_ev=0x7f7cca49e5e8) at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2762
#6  0x00007f7cddc38c74 in vppcom_epoll_wait_eventfd (wrk=0x7f7ccd4bad00, events=0x395000c, maxevents=63, n_evts=0, wait_for_time=0) at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2823
#7  0x00007f7cddc393a0 in vppcom_epoll_wait (vep_handle=33554435, events=0x395000c, maxevents=63, wait_for_time=0) at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2880
#8  0x00007f7cddc5d659 in vls_epoll_wait (ep_vlsh=3, events=0x395000c, maxevents=63, wait_for_time=0) at /home/wanghanlin/vpp-new/src/vcl/vcl_locked.c:895
#9  0x00007f7cdeb4c252 in ldp_epoll_pwait (epfd=67, events=0x3950000, maxevents=64, timeout=32, sigmask=0x0) at /home/wanghanlin/vpp-new/src/vcl/ldp.c:2334
#10 0x00007f7cdeb4c334 in epoll_wait (epfd=67, events=0x3950000, maxevents=64, timeout=32) at /home/wanghanlin/vpp-new/src/vcl/ldp.c:2389
#11 0x0000000000fc9458 in epoll_dispatch ()
#12 0x0000000000fc363c in event_base_loop ()
#13 0x0000000000c09b1c in Envoy::Server::WorkerImpl::threadRoutine (this=0x357d8c0, guard_dog=...) at source/server/worker_impl.cc:104
#14 0x0000000001193485 in std::function<void ()>::operator()() const (this=0x7f7ccd4b8544) at /usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/std_function.h:706
#15 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void ()>)::$_0::operator()(void*) const (this=<optimized out>, arg=0x2f42e2480) at source/common/common/posix/thread_impl.cc:33
#16 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void ()>)::$_0::__invoke(void*) (arg=0x2f42e2480) at source/common/common/posix/thread_impl.cc:32
#17 0x00007f7cde2164a4 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#18 0x00007f7cddf58d0f in clone () from /lib/x86_64-linux-gnu/libc.so.6

(gdb) f 3
#3  0x00007f7cddc2e85e in vcl_session_connected_handler (wrk=0x7f7ccd4bad00, mp=0x224052f4a) at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:471
471       rx_fifo->client_session_index = session_index;
(gdb) p rx_fifo
$1 = (svm_fifo_t *) 0x2f42e2480
(gdb) p *rx_fifo
Cannot access memory at address 0x2f42e2480
(gdb)

Regards,
Hanlin
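For reference, the fixed-address mapping Florin describes above (read the shared header first, then remap the segment at sh->ssvm_va) boils down to something like the sketch below. This is plain POSIX code illustrating the idea only, not the actual ssvm_slave_init_memfd implementation; the header struct, its fields, and the function name are invented for illustration.

/* Sketch of the two-step fixed-address mapping idea described above.
 * NOT the real ssvm code; fake_shared_header_t and map_segment_at_fixed_va
 * are made-up names. */
#include <stdint.h>
#include <sys/mman.h>

typedef struct
{
  uint64_t ssvm_va;   /* virtual address the master mapped the segment at */
  uint64_t ssvm_size; /* total segment size */
} fake_shared_header_t;

void *
map_segment_at_fixed_va (int memfd)
{
  /* Step 1: map the segment anywhere, just to read the shared header. */
  fake_shared_header_t *sh =
    mmap (0, sizeof (*sh), PROT_READ, MAP_SHARED, memfd, 0);
  if (sh == MAP_FAILED)
    return 0;

  uint64_t va = sh->ssvm_va;
  uint64_t size = sh->ssvm_size;
  munmap (sh, sizeof (*sh));

  /* Step 2: remap the whole segment at the address the master chose (A1 or
     A2 in the scenario above), so fifo pointers are valid in both processes.
     MAP_FIXED assumes the range [va, va + size) is free in this process. */
  void *base = mmap ((void *) va, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_FIXED, memfd, 0);
  return base == MAP_FAILED ? 0 : base;
}

Note that the address used in step 2 comes from the shared header inside the segment itself, not from the order in which notifications are processed, which is why the B1/B2 ordering should not matter as long as each worker ends up holding the right memfd.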