Hi Andreas,

On deeper analysis of the stack trace I found no regression in the pager code.
The symptoms appear to come from the final overflow output buffer.
unix_cli_add_pending_output is designed to write directly to the output
descriptor unless it is already buffering. The buffer only ever gets used when
a write to the descriptor fails or comes up short; typically that happens only
when the write would block (EAGAIN) or it couldn't deliver the whole request,
e.g. because the socket send buffer is full or the tty can't keep up; these
are rare. When it does happen, the undelivered bytes are copied to a buffer
and the event system is asked to poll the descriptor for when it can deliver
data again; until the buffer is emptied, it will continue to accumulate bytes.
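
As a rough sketch of that pattern (not the actual cli.c code; the names and
error handling here are illustrative only):

/* Illustrative only: the write-first, buffer-on-short-write pattern
 * described above, with made-up names; the real unix_cli_add_pending_output
 * differs in detail. */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

typedef struct
{
  int fd;              /* output descriptor (tty or socket) */
  char *pending;       /* overflow buffer, grows as needed */
  size_t pending_len;  /* bytes currently buffered */
  size_t pending_cap;  /* allocated size of the buffer */
} cli_out_t;

static void
cli_add_pending_output (cli_out_t *c, const char *buf, size_t n)
{
  /* If nothing is pending yet, try to write straight to the descriptor. */
  if (c->pending_len == 0)
    {
      ssize_t w = write (c->fd, buf, n);
      if (w == (ssize_t) n)
        return;               /* fully delivered; nothing to buffer */
      if (w < 0)
        {
          if (errno != EAGAIN && errno != EWOULDBLOCK)
            return;           /* hard error; real code handles this */
          w = 0;              /* would block: buffer the whole request */
        }
      buf += w;               /* only the undelivered tail is buffered */
      n -= w;
      /* Real code would also arm the event loop to poll for writability
       * and drain the buffer once the descriptor accepts data again. */
    }

  /* Append the undelivered bytes; this grow-and-copy is the resize seen
   * in the backtrace once the accumulation reaches tens of megabytes.
   * (realloc error check omitted for brevity.) */
  if (c->pending_len + n > c->pending_cap)
    {
      size_t new_cap = 2 * (c->pending_len + n);
      c->pending = realloc (c->pending, new_cap);
      c->pending_cap = new_cap;
    }
  memcpy (c->pending + c->pending_len, buf, n);
  c->pending_len += n;
}

The key point for this discussion is that nothing bounds how large the
pending buffer can grow while the descriptor stays unwritable.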

Your stack trace suggests this is what is happening.
unix_cli_add_pending_output would only try to resize a memory block if it were
executing the line that appends to the buffer. It is asking for ~40MB
(presumably the size of the accumulation so far), and that resize request
bombs, apparently ungracefully. This is where it gets interesting: the failure
occurs during the memcpy that moves the data to its newly resized location.
That implies VPP allocated the memory from the heap successfully and that the
process was ultimately ended by external action; VPP's memory manager did not
complain. (With memory overcommit an allocation can succeed and the process
only be killed when the new pages are first touched, which is consistent with
dying inside the memcpy.)

You don't include the details of the signal that ended the process, only a 
stipulation that it was an OOM error.

Was it Linux that reported the OOM? If so, my inclination would be to look at
the balance of memory usage across the whole system, not just VPP. VPP's
default heap is 1GB, plus any hugepages allocated for packet buffers.
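
If overall headroom turns out to be the problem, the main heap size can be
raised in startup.conf; as a sketch (the exact stanza has moved around between
releases, so check your version):

# startup.conf fragment, illustrative
heapsize 2G        # main heap; the default is 1G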

That said, it is a reasonable suggestion to cap how large this overflow buffer
can grow, to avoid unchecked memory bloat when large amounts of terminal
output cannot be delivered, if only to protect VPP. However, in this specific
case I believe it would only delay the inevitable OOM for a short period of
time.
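
Purely as an illustration of what such a cap could look like, slotted into the
sketch earlier in this mail just before the append (the constant and the drop
policy are made up, not anything VPP does today):

#define CLI_MAX_PENDING_OUTPUT (8u << 20)   /* e.g. 8MB; arbitrary */

  if (c->pending_len + n > CLI_MAX_PENDING_OUTPUT)
    {
      /* Keep only what still fits; alternatively the session could be
       * closed. Either way the heap stays bounded, though the underlying
       * delivery problem remains. */
      if (c->pending_len >= CLI_MAX_PENDING_OUTPUT)
        return;
      n = CLI_MAX_PENDING_OUTPUT - c->pending_len;
    }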

Chris.

> -----Original Message-----
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Chris Luke
> Sent: Thursday, May 16, 2019 6:42
> To: Andreas Schultz <andreas.schu...@travelping.com>
> Cc: vpp-dev@lists.fd.io
> Subject: Re: [EXTERNAL] [vpp-dev] VPP OOM crash in CLI
> 
> > -----Original Message-----
> > From: Andreas Schultz <andreas.schu...@travelping.com>
> > On Wed, 15 May 2019 at 19:31, Luke, Chris <chris_l...@comcast.com> wrote:
> > >
> > > The pager in the CLI retains output up to a certain amount but then
> > > gives up and switches to pass-through after a certain number of lines
> > > (default is 100000). If the output doesn't have newlines, or that
> > > default has been altered, then it will try to use more memory.
> > >
> > > In this case it appears to die while trying to increase the buffer to
> > > ~40MB in size, which is quite a lot; are these long lines that it is
> > > trying to display?
> >
> > That was the default FIB table output ('show ip fib' if I remember 
> > correctly).
> > Just with about 300000 entries. Same thing would probably happen on a
> > `show interfaces` command.
> 
> Okay, when I get a chance I'll look at the buffer size limit check in case
> of a regression since it was written. The intended behavior is that it
> buffers output in memory so that you can page through it, but once it hits
> that limit it gives up and dumps both the current contents of the buffer,
> and any future content, directly to the output. At the time I chose to not
> redirect it to a file, or similar, to avoid introducing more syscalls than
> necessary at runtime.
> 
> Chris.
> 
> >
> > Andreas
> >
> > >
> > > Chris.
> > >
> > > -----Original Message-----
> > > From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Andreas
> > > Schultz
> > > Sent: Wednesday, May 15, 2019 12:39 PM
> > > To: vpp-dev@lists.fd.io
> > > Subject: [EXTERNAL] [vpp-dev] VPP OOM crash in CLI
> > >
> > > Hi,
> > >
> > > It seems VPP's CLI is not very good at dealing with large FIBs or lots
> > > of interfaces. I know the CLI is a debug tool only, but IMHO it should
> > > not crash VPP that easily.
> > > On a FIB with 300k entries, the pager does not work and I get an OOM
> > > crash:
> > >
> > > (gdb) bt
> > > #0  clib_mov16 (src=<optimized out>, dst=<optimized out>) at
> > > /usr/src/vpp/src/vppinfra/memcpy_sse3.h:60
> > > #1  clib_mov32 (src=<optimized out>, dst=<optimized out>) at
> > > /usr/src/vpp/src/vppinfra/memcpy_sse3.h:67
> > > #2  clib_mov64 (src=<optimized out>, dst=<optimized out>) at
> > > /usr/src/vpp/src/vppinfra/memcpy_sse3.h:73
> > > #3  clib_mov128 (src=<optimized out>, dst=<optimized out>) at
> > > /usr/src/vpp/src/vppinfra/memcpy_sse3.h:81
> > > #4  clib_mov256 (src=0x7febd6d2ff30 "on184760 (p2p)\n[@0]: ipv4 via
> > > 0.0.0.0 upf_session184760: mtu:9000\npath:[184809] pl-index:184809
> > > ip4
> > > weight=1 pref=0 attached-nexthop:  oper-flags:resolved, cfg-
> > > flags:attached,\n  10.43.104.28 upf_sessi"...,
> > >     dst=0x7febd97aff30 "on184760 (p2p)\n[@0]: ipv4 via 0.0.0.0
> > > upf_session184760: mtu:9000\npath:[184809] pl-index:184809 ip4
> > > weight=1 pref=0 attached-nexthop:  oper-flags:resolved, cfg-
> > > flags:attached,\n  10.43.104.28 upf_sessi"...)
> > >     at /usr/src/vpp/src/vppinfra/memcpy_sse3.h:88
> > > #5  clib_memcpy_fast (n=40232024, src=0x7febd6d2ff30,
> > > dst=0x7febd97aff30) at /usr/src/vpp/src/vppinfra/memcpy_sse3.h:325
> > > #6  vec_resize_allocate_memory (v=v@entry=0x7febd4b9c01c,
> > > length_increment=length_increment@entry=201, data_bytes=40232076,
> > > header_bytes=<optimized out>, header_bytes@entry=0,
> > > data_align=data_align@entry=8) at /usr/src/vpp/src/vppinfra/vec.c:95
> > > #7  0x00007ffff6ae4233 in _vec_resize_inline (data_align=<optimized
> > > out>, header_bytes=<optimized out>, data_bytes=<optimized out>,
> > > length_increment=<optimized out>, v=<optimized out>) at
> > > /usr/src/vpp/src/vppinfra/vec.h:147
> > > #8  unix_cli_add_pending_output (uf=0x7ff2704bfe2c,
> > >     buffer=0x7fffbabeb96c "path:[209906] pl-index:209906 ip4
> > > weight=1
> > > pref=0 attached-nexthop:  oper-flags:resolved, cfg-flags:attached,\n
> > > 10.12.107.171 upf_session209858 (p2p)\n[@0]: ipv4 via 0.0.0.0
> > > upf_session209858: mtu:9000"..., buffer_bytes=201, cf=<optimized out>)
> > >     at /usr/src/vpp/src/vlib/unix/cli.c:544
> > > #9  0x00007ffff6ae5cb7 in unix_vlib_cli_output_raw
> > > (cf=cf@entry=0x7fffb93dc69c, uf=uf@entry=0x7ff2704bfe2c,
> > > buffer=<optimized out>, buffer_bytes=<optimized out>) at
> > > /usr/src/vpp/src/vlib/unix/cli.c:654
> > > #10 0x00007ffff6ae6475 in unix_vlib_cli_output_raw
> > > (buffer_bytes=<optimized out>, buffer=<optimized out>,
> > > uf=0x7ff2704bfe2c, cf=0x7fffb93dc69c) at
> > > /usr/src/vpp/src/vlib/unix/cli.c:620
> > > #11 unix_vlib_cli_output_cooked (cf=0x7fffb93dc69c, uf=0x7ff2704bfe2c,
> > >     buffer=0x7fffbabeb96c "path:[209906] pl-index:209906 ip4
> > > weight=1
> > > pref=0 attached-nexthop:  oper-flags:resolved, cfg-flags:attached,\n
> > > 10.12.107.171 upf_session209858 (p2p)\n[@0]: ipv4 via 0.0.0.0
> > > upf_session209858: mtu:9000"..., buffer_bytes=201)
> > >     at /usr/src/vpp/src/vlib/unix/cli.c:687
> > > #12 0x00007ffff6a8c79b in vlib_cli_output
> > > (vm=vm@entry=0x7ffff6d06700 <vlib_global_main>,
> > > fmt=fmt@entry=0x7ffff7889987 "%U") at
> > > /usr/src/vpp/src/vlib/cli.c:742
> > > #13 0x00007ffff77ffe23 in show_fib_path_command (vm=0x7ffff6d06700
> > > <vlib_global_main>, input=<optimized out>, cmd=<optimized out>) at
> > > /usr/src/vpp/src/vnet/fib/fib_path.c:2737
> > > #14 0x00007ffff6a8caa6 in vlib_cli_dispatch_sub_commands
> > > (vm=vm@entry=0x7ffff6d06700 <vlib_global_main>,
> > > cm=cm@entry=0x7ffff6d06900 <vlib_global_main+512>,
> > > input=input@entry=0x7fffbac5bf60,
> > > parent_command_index=<optimized
> > > out>) at /usr/src/vpp/src/vlib/cli.c:607
> > > #15 0x00007ffff6a8d0e7 in vlib_cli_dispatch_sub_commands
> > > (vm=vm@entry=0x7ffff6d06700 <vlib_global_main>,
> > > cm=cm@entry=0x7ffff6d06900 <vlib_global_main+512>,
> > > input=input@entry=0x7fffbac5bf60,
> > > parent_command_index=<optimized
> > > out>) at /usr/src/vpp/src/vlib/cli.c:568
> > > #16 0x00007ffff6a8d0e7 in vlib_cli_dispatch_sub_commands
> > > (vm=vm@entry=0x7ffff6d06700 <vlib_global_main>,
> > > cm=cm@entry=0x7ffff6d06900 <vlib_global_main+512>,
> > > input=input@entry=0x7fffbac5bf60,
> > > parent_command_index=parent_command_index@entry=0) at
> > > /usr/src/vpp/src/vlib/cli.c:568
> > > #17 0x00007ffff6a8d3b4 in vlib_cli_input (vm=0x7ffff6d06700
> > > <vlib_global_main>, input=input@entry=0x7fffbac5bf60,
> > > function=function@entry=0x7ffff6ae6900 <unix_vlib_cli_output>,
> > > function_arg=function_arg@entry=0) at
> > > /usr/src/vpp/src/vlib/cli.c:707
> > > #18 0x00007ffff6ae84c6 in unix_cli_process_input (cm=0x7ffff6d07040
> > > <unix_cli_main>, cli_file_index=0) at
> > > /usr/src/vpp/src/vlib/unix/cli.c:2420
> > > #19 unix_cli_process (vm=0x7ffff6d06700 <vlib_global_main>,
> > > rt=0x7fffbac4b000, f=<optimized out>) at
> > > /usr/src/vpp/src/vlib/unix/cli.c:2536
> > > #20 0x00007ffff6aa4e06 in vlib_process_bootstrap (_a=<optimized
> > > out>) at /usr/src/vpp/src/vlib/main.c:1469
> > > #21 0x00007ffff65a5864 in clib_calljmp () from
> > > /usr/src/vpp/build-root/install-vpp-native/vpp/lib/libvppinfra.so.19.08
> > > #22 0x00007fffb95ffb00 in ?? ()
> > > #23 0x00007ffff6aaa971 in vlib_process_startup (f=0x0,
> > > p=0x7fffbac4b000, vm=0x7ffff6d06700 <vlib_global_main>) at
> > > /usr/src/vpp/src/vlib/main.c:1491
> > > #24 dispatch_process (vm=0x7ffff6d06700 <vlib_global_main>,
> > > p=0x7fffbac4b000, last_time_stamp=0, f=0x0) at
> > > /usr/src/vpp/src/vlib/main.c:1536
> > > #25 0x0000000000000000 in ?? ()
> > >
> > > Regards
> > > Andreas
> > > --
> > > --
> > > Dipl.-Inform. Andreas Schultz
> > >
> > > ----------------------- enabling your networks ----------------------
> > > Travelping GmbH                     Phone:  +49-391-81 90 99 0
> > > Roentgenstr. 13                     Fax:    +49-391-81 90 99 299
> > > 39108 Magdeburg                     Email:  i...@travelping.com
> > > GERMANY                             Web:    http://www.travelping.com
> > >
> > > Company Registration: Amtsgericht Stendal        Reg No.:   HRB 10578
> > > Geschaeftsfuehrer: Holger Winkelmann          VAT ID No.: DE236673780
> > > --------------------------------------------------------------------
> > > -
> >
> >
> >
> > --
> > --
> > Dipl.-Inform. Andreas Schultz
> >
> > ----------------------- enabling your networks ----------------------
> > Travelping GmbH                     Phone:  +49-391-81 90 99 0
> > Roentgenstr. 13                     Fax:    +49-391-81 90 99 299
> > 39108 Magdeburg                     Email:  i...@travelping.com
> > GERMANY                             Web:    http://www.travelping.com
> >
> > Company Registration: Amtsgericht Stendal        Reg No.:   HRB 10578
> > Geschaeftsfuehrer: Holger Winkelmann          VAT ID No.: DE236673780
> > ---------------------------------------------------------------------