Hi All,
When we use the ping command with multiple worker threads, the reported ping
delay is sometimes nearly 1000 ms even though the two sites are directly
connected to each other.
We found that the function signal_ip46_icmp_reply_event uses
vlib_process_signal_event to deliver the ping response notification.
However, the packet-receiving thread is not the same thread as the CLI thread,
so the notification may fail.
In run_ping_ip46_address, the ICMP reply is then only picked up when the wait
times out, so the delay value is mainly determined by the ping interval
(1000 ms).
Hi yanzhang, you can try this patch. It fixes the problem when there is one
worker thread.
https://gerrit.fd.io/r/#/c/6688/
ping command does not work when there is worker thread (VPP-844)
王辉 wanghui
IT Development Engineer
NIV Nanjing Dept. IV/Wireless Product R&D Institute/Wireless Product
Operation Division
Nanjing
E: wang.hu...@zte.com.cn
www.zte.com.cn
Original Message
From: <leiyanzh...@raydonetworks.com>
To: <vpp-dev@lists.fd.io>
Date: 2017-06-21 16:58
Subject: [vpp-dev] [vpp-dev] delay is error in ping with multi worker thread
Hi All,
When we use the ping command with multiple worker threads, the reported ping
delay is sometimes nearly 1000 ms even though the two sites are directly
connected to each other.
We found that the function signal_ip46_icmp_reply_event uses
vlib_process_signal_event to deliver the ping response notification.
However, the packet-receiving thread is not the same thread as the CLI thread,
so the notification may fail.
In run_ping_ip46_address, the ICMP reply is then only picked up when the wait
times out, so the delay value is mainly determined by the ping interval
(1000 ms).
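To make the effect described above concrete, here is a minimal model in plain
C. It is not VPP code, and the struct and function names are hypothetical. It
assumes, as in the analysis above, that a reply whose wakeup is lost is only
noticed when the wait in the CLI process times out after one ping interval.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one echo request/reply; times are in seconds. */
typedef struct
{
  double time_sent;		/* when the echo request was sent */
  double time_arrived;		/* when the reply reached the receiving thread */
  double wait_timeout;		/* wait timeout of the CLI process (ping interval) */
  bool wakeup_delivered;	/* did the event actually wake the CLI process? */
} ping_wait_model_t;

/* RTT as the waiting process would report it. */
double
observed_rtt (const ping_wait_model_t * m)
{
  double deadline = m->time_sent + m->wait_timeout;
  /* Without a wakeup the waiter sleeps until the deadline (assumption). */
  double noticed = m->wakeup_delivered ? m->time_arrived : deadline;
  return noticed - m->time_sent;
}
```

With a 0.2 ms link and a 1 s interval, the model reports about 0.2 ms when the
wakeup is delivered and about 1000 ms when it is lost, which matches the
symptom we see.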
We changed the code to deliver the icmp_reply notification with
vl_api_rpc_call_main_thread, which uses an RPC callback to deliver the message
in the main thread.
After this change, however, we found that the delay is sometimes nearly 10 ms.
We found that the main thread is always blocked in linux_epoll_input, where
epoll_pwait is used, so the RPC callback function is sometimes called with a
10 ms delay.
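The blocking behaviour can be reproduced outside VPP with a small Linux-only
sketch. This is generic epoll code, not the linux_epoll_input implementation;
the function names and the eventfd are assumptions made for illustration. It
measures how long epoll_pwait stays blocked with a given timeout, once with no
activity and once after a registered eventfd has been written.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock in milliseconds. */
double
now_ms (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (double) ts.tv_sec * 1e3 + (double) ts.tv_nsec / 1e6;
}

/* Returns how long epoll_pwait stayed blocked, in ms, or -1.0 on error.
   If wake_first is nonzero, a registered eventfd is written beforehand,
   modelling a producer that wakes the loop instead of waiting for timeout. */
double
blocked_ms (int timeout_ms, int wake_first)
{
  struct epoll_event ev = { .events = EPOLLIN };
  struct epoll_event out;
  uint64_t one = 1;
  double elapsed = -1.0;
  int epfd = epoll_create1 (0);
  int efd = eventfd (0, EFD_NONBLOCK);

  if (epfd >= 0 && efd >= 0)
    {
      ev.data.fd = efd;
      if (epoll_ctl (epfd, EPOLL_CTL_ADD, efd, &ev) == 0
	  && (!wake_first
	      || write (efd, &one, sizeof (one)) == (ssize_t) sizeof (one)))
	{
	  double t0 = now_ms ();
	  /* A NULL sigmask makes this behave like epoll_wait. */
	  if (epoll_pwait (epfd, &out, 1, timeout_ms, NULL) >= 0)
	    elapsed = now_ms () - t0;
	}
    }
  if (efd >= 0)
    close (efd);
  if (epfd >= 0)
    close (epfd);
  return elapsed;
}
```

With nothing to read, blocked_ms (10, 0) returns only after the timeout has
elapsed, while blocked_ms (10, 1) returns almost immediately. Work queued for
a thread that sleeps this way therefore waits for the timeout unless something
makes a registered descriptor readable.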
For the ping function, we can record the packet receive time to avoid the
problem.
However, vl_api_rpc_call_main_thread is also used in other places, for
example bfd_rpc_update_session.
Its callback function, bfd_rpc_update_session_cb, uses clib_cpu_time_now, so
if the callback cannot be called immediately, this introduces a 10 ms delay.
In some situations this will cause BFD checks to fail.
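The difference between stamping the time where the packet is received and
reading the clock inside the deferred callback can be sketched as follows.
This is plain C with hypothetical names, not VPP code; in the change posted
further down, the work_time field set in signal_ip46_icmp_reply_event plays
the role of time_received.

```c
#include <assert.h>

/* Hypothetical record handed from the receiving thread to the callback. */
typedef struct
{
  double time_sent;		/* taken from the echo request, in seconds */
  double time_received;		/* stamped by the receiving thread */
} reply_record_t;

/* RTT based on the receive stamp: independent of callback latency. */
double
rtt_from_receive_stamp (const reply_record_t * r)
{
  return r->time_received - r->time_sent;
}

/* RTT based on the clock read inside the callback: includes any time
   the callback spent waiting to be scheduled. */
double
rtt_from_callback_time (const reply_record_t * r, double callback_time)
{
  return callback_time - r->time_sent;
}
```

If the callback runs 10 ms after the packet arrived, the second function
reports an RTT that is 10 ms too large, while the first one is unaffected. A
callback that reads the clock itself is exposed to the same skew.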
Is our analysis right, or did we miss something?
Our ping change is below:
typedef struct
{
  u8 event_type;
  uword ping_run_index;
  f64 work_time;
  icmp4_echo_request_header_t icmp4_header;
  icmp6_echo_request_header_t icmp6_header;
} ping_reply_event_arg_t;

static void
set_ping_reply_rpc_callback (ping_reply_event_arg_t * a)
{
  ping_main_t *pm = &ping_main;
  vlib_main_t *vm = vlib_get_main ();
  ASSERT (os_get_cpu_number () == 0);
  u8 event_type = a->event_type;
  /* Initialized so that a failed allocation does not signal an
     indeterminate buffer index. */
  u32 bi0_copy = 0;
  ping_run_t *pr = vec_elt_at_index (pm->ping_runs, a->ping_run_index);
  if (vlib_buffer_alloc (vm, &bi0_copy, 1) == 1)
    {
      void *dst = vlib_buffer_get_current (vlib_get_buffer (vm, bi0_copy));
      if (PING_RESPONSE_IP4 == event_type)
	{
	  clib_memcpy (dst, &(a->icmp4_header),
		       sizeof (icmp4_echo_request_header_t));
	}
      else
	{
	  clib_memcpy (dst, &(a->icmp6_header),
		       sizeof (icmp6_echo_request_header_t));
	}
    }
  f64 rtt = vlib_time_now (vm) - a->icmp4_header.icmp_echo.time_sent;
  vlib_process_signal_event (vm, pr->cli_process_id, event_type, bi0_copy);
}

/*
* If we can find the ping run by an ICMP ID, then we send the signal
* to the CLI process referenced by that ping run, alongside with
* a freshly made copy of the packet.
* I opted for a packet copy to keep the main packet processing path
* the same as for all the other nodes.
*
*/
static int
signal_ip46_icmp_reply_event (vlib_main_t * vm,
			      u8 event_type, vlib_buffer_t * b0)
{
  ping_main_t *pm = &ping_main;
  u16 net_icmp_id = 0;
  ping_reply_event_arg_t args;
  args.event_type = event_type;
  switch (event_type)
    {
    case PING_RESPONSE_IP4:
      {
	icmp4_echo_request_header_t *h0 = vlib_buffer_get_current (b0);
	net_icmp_id = h0->icmp_echo.id;
	clib_memcpy (&(args.icmp4_header), h0,
		     sizeof (icmp4_echo_request_header_t));
      }
      break;
    case PING_RESPONSE_IP6:
      {
	icmp6_echo_request_header_t *h0 = vlib_buffer_get_current (b0);
	net_icmp_id = h0->icmp_echo.id;
	clib_memcpy (&(args.icmp6_header), h0,
		     sizeof (icmp6_echo_request_header_t));
      }
    }
  uword *p = hash_get (pm->ping_run_by_icmp_id,
		       clib_net_to_host_u16 (net_icmp_id));
  if (!p)
    {
      return 0;
    }
  args.work_time = vlib_time_now (vm);
  args.ping_run_index = p[0];
  vl_api_rpc_call_main_thread (set_ping_reply_rpc_callback,
			       (u8 *) & args, sizeof (args));
  return 1;
}
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev