Hi Hanlin, 

Thanks to Dave, we can now have per-thread binary api connections to vpp. I’ve 
updated the socket client and vcl to leverage this, so after [1] each vcl worker 
thread has its own binary api socket that is used to exchange fds. 
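
For reference, here is a minimal sketch of the pattern (illustrative names and 
socket path only, not the actual vl_socket_client code): each vcl worker thread 
keeps its own AF_UNIX connection in thread-local storage, so fd-carrying 
messages are never read from a socket shared with another worker.

/* Sketch only: hypothetical names, not the vpp socket client API. */
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

/* one binary api socket per vcl worker thread */
static __thread int worker_api_sock = -1;

static int
worker_api_connect (const char *sock_path /* placeholder path */)
{
  struct sockaddr_un sun = { .sun_family = AF_UNIX };
  int fd = socket (AF_UNIX, SOCK_STREAM, 0);
  if (fd < 0)
    return -1;
  strncpy (sun.sun_path, sock_path, sizeof (sun.sun_path) - 1);
  if (connect (fd, (struct sockaddr *) &sun, sizeof (sun)) < 0)
    {
      close (fd);
      return -1;
    }
  worker_api_sock = fd; /* thread-local, so no overlap between workers */
  return 0;
}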

Let me know if you’re still hitting the issue. 

Regards,
Florin

[1] https://gerrit.fd.io/r/c/vpp/+/23687

> On Nov 22, 2019, at 10:30 AM, Florin Coras <fcoras.li...@gmail.com> wrote:
> 
> Hi Hanlin, 
> 
> Okay, that’s a different issue. The expectation is that each vcl worker has a 
> different binary api transport into vpp. This assumption holds for 
> applications with multiple process workers (like nginx) but is not completely 
> satisfied for applications with thread workers. 
> 
> Namely, for each vcl worker we connect over the socket api to vpp and 
> initialize the shared memory transport (so binary api messages are delivered 
> over shared memory instead of the socket). However, as you’ve noted, the 
> socket client is currently not multi-thread capable; consequently, we have an 
> overlap of socket client fds between the workers. The first segment is 
> assigned properly but the subsequent ones will fail in this scenario. 
> 
> I wasn’t aware of this, so we’ll have to either fix the socket binary api 
> client for multi-threaded apps, or change the session layer to use different 
> fds for exchanging memfd fds. 
> 
> Regards, 
> Florin
> 
>> On Nov 21, 2019, at 11:47 PM, wanghanlin <wanghan...@corp.netease.com> wrote:
>> 
>> Hi Florin,
>> Regarding 3), I think the main problem may be in the function 
>> vl_socket_client_recv_fd_msg, called by vcl_session_app_add_segment_handler.  
>> Multiple worker threads share the same scm->client_socket.fd, so B2 may 
>> receive the segment memfd belonging to A1.
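>> 
>> To illustrate why sharing one fd is racy, here is a rough sketch of fd 
>> reception over a unix socket (generic SCM_RIGHTS code, not the actual 
>> vl_socket_client_recv_fd_msg): if two worker threads call recvmsg() on the 
>> same socket, whichever runs first dequeues the next fd-carrying message.
>> 
>> #include <string.h>
>> #include <sys/socket.h>
>> #include <sys/uio.h>
>> 
>> /* Sketch: receive one fd sent with SCM_RIGHTS over a unix socket. */
>> static int
>> recv_one_fd (int sock_fd)
>> {
>>   char b;
>>   struct iovec iov = { .iov_base = &b, .iov_len = 1 };
>>   union { struct cmsghdr align; char buf[CMSG_SPACE (sizeof (int))]; } u;
>>   struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
>>                         .msg_control = u.buf,
>>                         .msg_controllen = sizeof (u.buf) };
>>   if (recvmsg (sock_fd, &msg, 0) <= 0)
>>     return -1;
>>   struct cmsghdr *cm = CMSG_FIRSTHDR (&msg);
>>   if (!cm || cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SCM_RIGHTS)
>>     return -1;
>>   int fd;
>>   memcpy (&fd, CMSG_DATA (cm), sizeof (fd));
>>   /* if sock_fd is shared, nothing ties this fd to the worker that the
>>      message was actually meant for */
>>   return fd;
>> }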
>> 
>>  
>> Regards,
>> Hanlin
>> 
>> On 11/22/2019 01:44, Florin Coras <fcoras.li...@gmail.com> wrote: 
>> Hi Hanlin, 
>> 
>> As Jon pointed out, you may want to register with gerrit. 
>> 
>> Your comments with respect to points 1) and 2) are spot on. I’ve updated the 
>> patch to fix them. 
>> 
>> Regarding 3), if I understood your scenario correctly, it should not happen. 
>> The ssvm infra forces applications to map segments at fixed addresses. That 
>> is, for the scenario you’re describing below, if B2 is processed first, 
>> ssvm_slave_init_memfd will map the segment at A2. Note how we first map the 
>> segment to read the shared header (sh) and then use sh->ssvm_va (which 
>> should be A2) to remap the segment at a fixed virtual address (va). 
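>> 
>> In rough pseudo-C, the two-step map looks like this (simplified sketch with 
>> abbreviated names, not the actual ssvm code):
>> 
>> #include <stdint.h>
>> #include <sys/mman.h>
>> 
>> /* simplified stand-in for the ssvm shared header */
>> typedef struct { uint64_t ssvm_va; uint64_t ssvm_size; } shared_hdr_t;
>> 
>> static void *
>> map_segment_at_fixed_va (int memfd, long page_sz)
>> {
>>   /* 1. map one page to read the header vpp wrote into the segment */
>>   shared_hdr_t *sh = mmap (0, page_sz, PROT_READ | PROT_WRITE,
>>                            MAP_SHARED, memfd, 0);
>>   if (sh == MAP_FAILED)
>>     return 0;
>>   void *va = (void *) (uintptr_t) sh->ssvm_va; /* address chosen by vpp */
>>   uint64_t size = sh->ssvm_size;
>>   munmap (sh, page_sz);
>> 
>>   /* 2. remap the whole segment at that fixed virtual address */
>>   return mmap (va, size, PROT_READ | PROT_WRITE,
>>                MAP_SHARED | MAP_FIXED, memfd, 0);
>> }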
>> 
>> Regards,
>> Florin
>> 
>>> On Nov 21, 2019, at 2:49 AM, wanghanlin <wanghan...@corp.netease.com> wrote:
>>> 
>>> Hi Florin,
>>> I have applied the patch and found some problems in my case. I don't have 
>>> the rights to post comments in gerrit, so I'm posting them here.
>>> 1) evt->event_type should be set to SESSION_CTRL_EVT_APP_DEL_SEGMENT 
>>> rather than SESSION_CTRL_EVT_APP_ADD_SEGMENT. File: 
>>> src/vnet/session/session_api.c, Line: 561, Function: mq_send_del_segment_cb
>>> 2) session_send_fds should be called at the end of function 
>>> mq_send_add_segment_cb; otherwise the lock on app_mq can't be freed here. File: 
>>> src/vnet/session/session_api.c, Line: 519, Function: mq_send_add_segment_cb 
>>> 3) When vcl_segment_attach is called in each worker thread, 
>>> ssvm_slave_init_memfd can also be called in each worker thread, and it picks 
>>> the map address sequentially by mapping the segment once in advance. That's 
>>> OK with only one thread, but it may be wrong with multiple worker threads. 
>>> Suppose the following scenario: VPP allocates a segment at address A1 and 
>>> notifies worker thread B1, expecting B1 to also map the segment at address 
>>> A1, and simultaneously VPP allocates a segment at address A2 and notifies 
>>> worker thread B2, expecting B2 to map the segment at address A2. If B2 
>>> processes its notification first, then ssvm_slave_init_memfd may map the 
>>> segment at address A1. Maybe VPP could add the segment map address to the 
>>> notification message, and then the worker thread would just map the segment 
>>> at this address, roughly as sketched below. 
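>>> 
>>> A rough sketch of that suggestion (the message layout below is made up for 
>>> illustration, it is not an existing vpp structure): carry the map address in 
>>> the add-segment notification so each worker maps at exactly that va.
>>> 
>>> #include <stdint.h>
>>> #include <sys/mman.h>
>>> 
>>> /* hypothetical notification carrying the address vpp mapped the segment at */
>>> typedef struct
>>> {
>>>   uint64_t segment_va;
>>>   uint64_t segment_size;
>>>   int memfd;              /* received separately over SCM_RIGHTS */
>>> } add_segment_notify_t;
>>> 
>>> static void *
>>> attach_segment (add_segment_notify_t *n)
>>> {
>>>   /* map directly at the address vpp announced, no local guessing */
>>>   return mmap ((void *) (uintptr_t) n->segment_va, n->segment_size,
>>>                PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,
>>>                n->memfd, 0);
>>> }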
>>> 
>>> Regards,
>>> Hanlin
>>> On 11/19/2019 09:50, wanghanlin <wanghan...@corp.netease.com> wrote: 
>>> Hi  Florin,
>>> The VPP version is v19.08.
>>> I'll apply this patch and check it. Thanks a lot!
>>> 
>>> Regards,
>>> Hanlin
>>> On 11/16/2019 00:50, Florin Coras <fcoras.li...@gmail.com> wrote: 
>>> Hi Hanlin,
>>> 
>>> Just to make sure, are you running master or some older VPP?
>>> 
>>> Regarding the issue you could be hitting below, here’s [1] a patch that I 
>>> have not yet pushed for merging because it leads to api changes for 
>>> applications that directly use the session layer application interface 
>>> instead of vcl. I haven’t tested it extensively, but the goal with it is to 
>>> signal segment allocation/deallocation over the mq instead of the binary 
>>> api.
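>>> 
>>> As a rough illustration of what that looks like on the app side (the event 
>>> ids are from the patch, but the dispatcher and message layout below are 
>>> invented for the sketch): segment add/del notifications arrive as control 
>>> events on the app message queue and are handled there, with the memfd 
>>> itself still exchanged over the app's socket.
>>> 
>>> /* sketch only: illustrative dispatcher, not the vcl implementation */
>>> enum
>>> {
>>>   SESSION_CTRL_EVT_APP_ADD_SEGMENT = 1,   /* placeholder values */
>>>   SESSION_CTRL_EVT_APP_DEL_SEGMENT = 2,
>>> };
>>> 
>>> typedef struct { int event_type; char data[64]; } mq_ctrl_evt_t;
>>> 
>>> static void
>>> handle_mq_ctrl_event (mq_ctrl_evt_t *e)
>>> {
>>>   switch (e->event_type)
>>>     {
>>>     case SESSION_CTRL_EVT_APP_ADD_SEGMENT:
>>>       /* receive the memfd for the new segment and map it */
>>>       break;
>>>     case SESSION_CTRL_EVT_APP_DEL_SEGMENT:
>>>       /* unmap and forget the segment */
>>>       break;
>>>     default:
>>>       break;
>>>     }
>>> }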
>>> 
>>> Finally, I’ve never tested LDP with Envoy, so I’m not sure that works 
>>> properly. There’s ongoing work to integrate Envoy with VCL, so you may want 
>>> to get in touch with the authors. 
>>> 
>>> Regards,
>>> Florin
>>> 
>>> [1] https://gerrit.fd.io/r/c/vpp/+/21497
>>> 
>>>> On Nov 15, 2019, at 2:26 AM, wanghanlin <wanghan...@corp.netease.com> wrote:
>>>> 
>>>> Hi all,
>>>> I accidentally got the following crash stack when I used VCL with hoststack 
>>>> and memfd. But the corresponding "invalid" rx_fifo address (0x2f42e2480) is 
>>>> valid in the VPP process and can also be found in /proc/<pid>/maps. That is, 
>>>> the shared memfd segment memory is not consistent between the hoststack app 
>>>> and VPP.
>>>> Generally, VPP allocates/deallocates the memfd segment and then notifies 
>>>> the hoststack app to attach/detach. But what happens if, just after VPP 
>>>> deallocates a memfd segment and notifies the hoststack app, VPP immediately 
>>>> allocates the same memfd segment again because a session connected? Because 
>>>> the hoststack app processes the dealloc message and the connected message 
>>>> in different threads, rx_thread_fn may have just detached the memfd segment 
>>>> and not yet re-attached the same memfd segment when, unfortunately, the 
>>>> worker thread gets the connected message. 
>>>> 
>>>> This is just my guess; maybe I have misunderstood something.
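>>>> 
>>>> To show just the mechanism I suspect (generic code, not vcl itself): once a 
>>>> thread unmaps the segment, any pointer that still refers into it, like the 
>>>> rx_fifo taken from the connected message, faults exactly like the backtrace 
>>>> below.
>>>> 
>>>> #define _GNU_SOURCE            /* for memfd_create on glibc */
>>>> #include <stdio.h>
>>>> #include <unistd.h>
>>>> #include <sys/mman.h>
>>>> 
>>>> int
>>>> main (void)
>>>> {
>>>>   long sz = sysconf (_SC_PAGESIZE);
>>>>   int fd = memfd_create ("seg", 0);
>>>>   if (fd < 0 || ftruncate (fd, sz) < 0)
>>>>     return 1;
>>>> 
>>>>   /* "attach": map the segment and keep a pointer into it */
>>>>   char *rx_fifo = mmap (0, sz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>>>>   if (rx_fifo == MAP_FAILED)
>>>>     return 1;
>>>>   rx_fifo[0] = 1;                      /* fine while mapped */
>>>> 
>>>>   /* "detach" runs (on another thread in the real app) before use */
>>>>   munmap (rx_fifo, sz);
>>>> 
>>>>   /* rx_fifo[0] = 2;   <-- would SIGSEGV now, just like frame #3 below */
>>>>   printf ("pointer %p is stale after munmap\n", (void *) rx_fifo);
>>>>   close (fd);
>>>>   return 0;
>>>> }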
>>>> 
>>>> (gdb) bt
>>>> #0  0x00007f7cde21ffbf in raise () from 
>>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>>> #1  0x0000000001190a64 in Envoy::SignalAction::sigHandler (sig=11, 
>>>> info=<optimized out>, context=<optimized out>) at 
>>>> source/common/signal/signal_action.cc:73
>>>> #2  <signal handler called>
>>>> #3  0x00007f7cddc2e85e in vcl_session_connected_handler 
>>>> (wrk=0x7f7ccd4bad00, mp=0x224052f4a) at 
>>>> /home/wanghanlin/vpp-new/src/vcl/vppcom.c:471
>>>> #4  0x00007f7cddc37fec in vcl_epoll_wait_handle_mq_event 
>>>> (wrk=0x7f7ccd4bad00, e=0x224052f48, events=0x395000c, 
>>>> num_ev=0x7f7cca49e5e8)
>>>>     at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2658
>>>> #5  0x00007f7cddc3860d in vcl_epoll_wait_handle_mq (wrk=0x7f7ccd4bad00, 
>>>> mq=0x224042480, events=0x395000c, maxevents=63, wait_for_time=0, 
>>>> num_ev=0x7f7cca49e5e8)
>>>>     at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2762
>>>> #6  0x00007f7cddc38c74 in vppcom_epoll_wait_eventfd (wrk=0x7f7ccd4bad00, 
>>>> events=0x395000c, maxevents=63, n_evts=0, wait_for_time=0)
>>>>     at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2823
>>>> #7  0x00007f7cddc393a0 in vppcom_epoll_wait (vep_handle=33554435, 
>>>> events=0x395000c, maxevents=63, wait_for_time=0) at 
>>>> /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2880
>>>> #8  0x00007f7cddc5d659 in vls_epoll_wait (ep_vlsh=3, events=0x395000c, 
>>>> maxevents=63, wait_for_time=0) at 
>>>> /home/wanghanlin/vpp-new/src/vcl/vcl_locked.c:895
>>>> #9  0x00007f7cdeb4c252 in ldp_epoll_pwait (epfd=67, events=0x3950000, 
>>>> maxevents=64, timeout=32, sigmask=0x0) at 
>>>> /home/wanghanlin/vpp-new/src/vcl/ldp.c:2334
>>>> #10 0x00007f7cdeb4c334 in epoll_wait (epfd=67, events=0x3950000, 
>>>> maxevents=64, timeout=32) at /home/wanghanlin/vpp-new/src/vcl/ldp.c:2389
>>>> #11 0x0000000000fc9458 in epoll_dispatch ()
>>>> #12 0x0000000000fc363c in event_base_loop ()
>>>> #13 0x0000000000c09b1c in Envoy::Server::WorkerImpl::threadRoutine 
>>>> (this=0x357d8c0, guard_dog=...) at source/server/worker_impl.cc:104
>>>> #14 0x0000000001193485 in std::function<void ()>::operator()() const 
>>>> (this=0x7f7ccd4b8544)
>>>>     at 
>>>> /usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/std_function.h:706
>>>> #15 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void 
>>>> ()>)::$_0::operator()(void*) const (this=<optimized out>, arg=0x2f42e2480)
>>>>     at source/common/common/posix/thread_impl.cc:33
>>>> #16 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void 
>>>> ()>)::$_0::__invoke(void*) (arg=0x2f42e2480) at 
>>>> source/common/common/posix/thread_impl.cc:32
>>>> #17 0x00007f7cde2164a4 in start_thread () from 
>>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>>> #18 0x00007f7cddf58d0f in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>> (gdb) f 3
>>>> #3  0x00007f7cddc2e85e in vcl_session_connected_handler 
>>>> (wrk=0x7f7ccd4bad00, mp=0x224052f4a) at 
>>>> /home/wanghanlin/vpp-new/src/vcl/vppcom.c:471
>>>> 471       rx_fifo->client_session_index = session_index;
>>>> (gdb) p rx_fifo
>>>> $1 = (svm_fifo_t *) 0x2f42e2480
>>>> (gdb) p *rx_fifo
>>>> Cannot access memory at address 0x2f42e2480
>>>> (gdb)
>>>> 
>>>> 
>>>> Regards,
>>>> Hanlin
>>> 
>> 
> 
