As Andrew says, vppctl is not ideal for this purpose.

Your GDB trace looks like it was vppctl itself, and it's stuck waiting on 
connect() - which means the far end, VPP, did not accept() the connection. My 
immediate reaction would be to check netstat to see if there's lots of old unix 
sockets still connected to the CLI endpoint.

But digging deeper, we should note that ping is in some regards special in the 
way it handles the CLI session compared to other CLI commands, because it wants 
to solicit input to interrupt a long-running ping. All of the CLI interaction - 
input and output - is in the main thread and each ping runs synchronously in 
that thread, if I remember correctly. Did I read that you run pings to 50 
different destinations every 5 seconds? Are those 50 dispatched in parallel? I 
think you'll be hitting contention for the main thread if so; while it's 
pinging it's not accepting connections. If several of your destinations are not 
reachable then you'll be waiting on several ping timeouts before any accept() 
happens.
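
To make the netstat check concrete, here is a minimal Python sketch that 
counts sockets bound to the CLI endpoint, grouped by state, from the contents 
of /proc/net/unix on Linux. The path /run/vpp/cli.sock is an assumption (check 
the cli-listen setting in your startup.conf), and count_cli_sockets is just a 
name made up for this example.

```python
from collections import Counter

# Assumed CLI socket path; adjust to match "unix { cli-listen ... }".
CLI_SOCK = "/run/vpp/cli.sock"


def count_cli_sockets(proc_net_unix_text, path=CLI_SOCK):
    """Count /proc/net/unix entries bound to `path`, keyed by the St column.

    Columns are: Num RefCount Protocol Flags Type St Inode Path.
    St "01" is unconnected (the listening socket shows up this way),
    St "03" is connected (sessions that VPP has accepted).
    """
    counts = Counter()
    for line in proc_net_unix_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Unbound sockets have no Path column, so they are skipped here.
        if len(fields) >= 8 and fields[7] == path:
            counts[fields[5]] += 1
    return counts
```

Feed it the text of /proc/net/unix, for example 
count_cli_sockets(open("/proc/net/unix").read()). A count under "03" that 
keeps growing over days would suggest old CLI sessions are never being closed.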

Chris.


-----Original Message-----
From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Andrew Yourtchenko
Sent: Thursday, April 8, 2021 11:18
To: yichanglui <yichan...@126.com>
Cc: vpp-dev@lists.fd.io
Subject: [EXTERNAL] Re: [vpp-dev] vpp hang on

Hi,

first, a general comment: the vpp ping was intended only for quick 
connectivity/diagnostic checks, not for continuous monitoring, and especially 
not via vppctl, since that involves a lot of overhead.

There are a few options:

1) Before the commit 34716fae918750e4fc7a7da4b06e0dfbdef2d1c5, there was code 
that performed the monitoring of targets on an ongoing basis.
You could probably take a look at that code and build something that is a bit 
more scalable and is more lightweight - and then upstream it as a separate 
plugin, provided you are happy to become a maintainer.

2) worst case, have a daemon that connects to VPP over the API and issues the 
commands using cli_inband; maybe even add an API call to the ping plugin, 
so that you do not have to parse the commands.
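
As a rough sketch of option 2, the Python below probes the targets one at a 
time through a single callable that sends a CLI string to VPP and returns the 
output text. Everything here is an assumption for illustration: probe and 
run_cli are made-up names, run_cli would have to be wired up to cli_inband by 
your daemon (for example via the Python API bindings), and the "N sent, M 
received" summary line it looks for should be checked against the ping output 
of your VPP version.

```python
import re

# Assumed shape of the ping summary line: "... 1 sent, 1 received, ...".
_STATS = re.compile(r"(\d+) sent, (\d+) received")


def probe(run_cli, targets):
    """Ping each (address, table_id) pair in turn over one CLI channel.

    run_cli: hypothetical callable taking a CLI command string and
             returning the command output as text.
    Returns a dict mapping (address, table_id) -> True if a reply was seen.
    """
    results = {}
    for addr, table_id in targets:
        out = run_cli(f"ping {addr} repeat 1 table-id {table_id}")
        match = _STATS.search(out)
        # Treat a missing or unparsable summary as "not reachable".
        results[(addr, table_id)] = (
            match is not None and int(match.group(2)) > 0
        )
    return results
```

Running the probes sequentially over one long-lived connection avoids a new 
vppctl connect() per ping, but each ping still occupies VPP while it runs, so 
unreachable targets will still cost one timeout each.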

That said, we should narrow down the issue that you are observing with the CLI, 
since regardless of how many times you do vppctl there should be no problem. 
Even if your version 19.08 is no longer receiving fixes, it's worth 
understanding what the problem is.

Were you able to reproduce the issue in your lab environment?

--a


On 4/8/21, yichanglui <yichan...@126.com> wrote:
> hi
>
> We are going to use "vppctl ping" to detect whether a device is
> reachable, but we have run into the problem below:
>
> We have more than 50 services that run "vppctl ping 100.100.0.x repeat 1
> table-id x" to probe different devices. Each service executes this
> command every 5 seconds. After 3~4 days, all the vppctl ping services
> hang. Debugging with gdb shows vppctl stuck at the point below, in
> src/vppinfra/socket.c: clib_socket_init
>
> ==================================
>
> 416  if (addr.sa.sa_family == PF_INET)
> (gdb)
> 419  if (s->flags & CLIB_SOCKET_F_IS_SERVER)
> (gdb)
> 497      if ((s->flags & CLIB_SOCKET_F_NON_BLOCKING_CONNECT)
> (gdb)
> 505      if (connect (s->fd, &addr.sa, addr_len) < 0
>
>
> ==================================
>
> Connecting to VPP from vppctl many times may be causing an error on
> the VPP socket.
>
> Our VPP version is below:
>
> =====================================
>
> Version:                  v19.08.1-229~g1517d5e72-dirty
> Compiled by:              root
> Compile host:             i-qvoo7mp3
> Compile date:             Fri Nov  1 20:53:55 CST 2019
> Compile location:         /root/ws/vpp
> Compiler:                 GCC 7.5.0
> Current PID:              4652
>
> =====================================
>
>
> Linux version is below:
>
>
> Linux version 4.15.0-112-generic (buildd@lcy01-amd64-027) (gcc version
> 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #113-Ubuntu SMP Thu Jul 9
> 23:41:39 UTC 2020
>
>
>
>
> On the other hand, is there a better method for network health
> detection, such as one-arm-echo BFD? We may need to monitor devices
> that do not support BFD.