Thanks Benoit. We are spending some time on the custom plugin implementation and will get back to you whenever we have any update. Note that this custom plugin runs fine on the 2005 version.
Thanks
Vipin A

On Tue, Dec 14, 2021 at 3:10 PM Benoit Ganne (bganne) <bga...@cisco.com> wrote:

> The abort() itself is because you are out of heap:
> https://git.fd.io/vpp/tree/src/vppinfra/mem.h?h=stable/2106#n243
> As you pointed out this is caused by the ridiculous size of the pending
> vector.
> All this smells corruption of the pending frames: do you enqueue the same
> frame several times?
> I see you have a private plugin, it would be good to see if you can
> reproduce the issue with only upstream VPP (no private code).
>
> Best
> ben
>
> > -----Original Message-----
> > From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of vipin
> > allawadhi
> > Sent: Tuesday 14 December 2021 10:20
> > To: vpp-dev@lists.fd.io
> > Subject: [vpp-dev]crash in vec_resize_allocate_memory
> >
> > Hello VPP experts,
> >
> > fdio version: 2106
> >
> > We are seeing the following crash with this version. with the earlier
> > version (we were using fdio 2005), we don't see any problem.
> > Have you seen a similar issue earlier?
> > Any idea what could be the root cause based on the information given
> > below?
> >
> > (gdb) bt
> >
> > #0 0x00007f3f5bf872a2 in raise () from /lib64/libc.so.6
> > #1 0x00007f3f5bf708a4 in abort () from /lib64/libc.so.6
> > #2 0x0000563169f469a0 in os_panic ()
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vpp/vnet/main.c:453
> > #3 0x00007f3f5c334597 in clib_mem_alloc_aligned_at_offset (size=<optimized out>, align=8, align_offset=<optimized out>, os_out_of_memory_on_failure=1)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vppinfra/mem.h:243
> > #4 vec_resize_allocate_memory (v=<optimized out>, v@entry=0x7f3f4ef613e0, length_increment=<optimized out>, length_increment@entry=1, data_bytes=167075112, header_bytes=8, header_bytes@entry=0, data_align=data_align@entry=8, numa_id=numa_id@entry=255)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vppinfra/vec.c:111
> > #5 0x00007f3f5c4025cb in _vec_resize_inline (v=0x7f3f4ef613e0, length_increment=1, data_bytes=<optimized out>, header_bytes=0, data_align=8, numa_id=255)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vppinfra/vec.h:170
> > #6 vlib_put_next_frame (vm=<optimized out>, vm@entry=0x7f3f30239780, r=r@entry=0x7f3f30d1b180, next_index=next_index@entry=2, n_vectors_left=<optimized out>)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/main.c:543
> > #7 0x00007f3f5c5a61cd in enqueue_one (vm=0x7f3f30239780, node=0x7f3f30d1b180, used_elt_bmp=0x7f3bf9bf8f20, next_index=<optimized out>, buffers=0x7f3f30562090, nexts=0x7f3bf9bf9420, n_buffers=<optimized out>, n_left=<optimized out>, tmp=0x7f3bf9bf8f60)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/buffer_funcs.c:105
> > #8 vlib_buffer_enqueue_to_next_fn_skx (vm=0x7f3f30239780, node=0x7f3f30d1b180, buffers=0x7f3f30562090, nexts=<optimized out>, count=<optimized out>)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/buffer_funcs.c:153
> > #9 0x00007f3edac5bcaf in vlib_buffer_enqueue_to_next (count=<optimized out>, nexts=0x7f3bf9bf9420, buffers=<optimized out>, node=0x7f3f30d1b180, vm=0x7f3f30239780)
> >    at /usr/cna/bld-dataplane_base/base/cni-infra-dataplane/fdio/src/fdio.2106/build-root/install-vpp_debug-native/vpp/include/vlib/buffer_node.h:344
> > #10 custome_node_input_inline (is_trace=<optimized out>, frame=<optimized out>, node=<optimized out>, p_vlib_main=<optimized out>)
> >    at /src/cna/.build/dbg/external-package/fdio/src/fdio.2106/src/an-plugins/custom-plugin/custom_input_node.c
> > #11 custom_node_input_fn (vm=0x7f3f30239780, node=<optimized out>, frame=0x7f3f30562080)
> >    at /src/cna/.build/dbg/external-package/fdio/src/fdio.2106/src/an-plugins/custom-plugin/custom_input_node.c
> > #12 0x00007f3f5c405427 in dispatch_node (vm=0x7f3f30239780, node=0x7f3f30d1b180, type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=<optimized out>, last_time_stamp=<optimized out>)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/main.c:1058
> > #13 dispatch_pending_node (vm=0x7f3f30239780, pending_frame_index=10442192, last_time_stamp=<optimized out>)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/main.c:1238
> > #14 vlib_main_or_worker_loop (vm=0x7f3f30239780, is_main=0)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/main.c:1822
> > #15 vlib_worker_loop (vm=vm@entry=0x7f3f30239780)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/main.c:1956
> > #16 0x00007f3f5c48817d in vlib_worker_thread_fn (arg=<optimized out>)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/threads.c:1617
> > #17 0x00007f3f5c33f56c in clib_calljmp ()
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vppinfra/longjmp.S:123
> > #18 0x00007f3bfb7fdc30 in ?? ()
> > #19 0x00007f3f5c46f0e7 in vlib_worker_thread_bootstrap_fn (arg=0x7f3edc721a40)
> >    at /usr/src/debug/vpp-21.06.0-3~g50650da54_dirty.x86_64/src/vlib/threads.c:488
> > #20 0x0000000000000000 in ?? ()
> > (gdb) thread apply all bt
> >
> >
> > couple of things we saw while debugging this problem:
> >
> > 1. pending_frame vector len is huge:
> > (gdb) p nm->pending_frames
> > $4 = (vlib_pending_frame_t *) 0x7f3f4532fe30
> > (gdb) get_vec_len 0x7f3f4532fe30
> > $5 = 10236250
> > (gdb)
> >
> > 2. pending_frame vector has duplicate entries:
> > (gdb) p nm->pending_frames[0]
> > $10 = {frame = 0x7f3f30709b80, node_runtime_index = 648, next_frame_index = 4294967295}
> > (gdb) p nm->pending_frames[1]
> > $11 = {frame = 0x7f3f30709280, node_runtime_index = 550, next_frame_index = 2299}
> > (gdb) p nm->pending_frames[2]
> > $12 = {frame = 0x7f3f302e2140, node_runtime_index = 548, next_frame_index = 2315}
> > (gdb) p nm->pending_frames[3]
> > $13 = {frame = 0x7f3f302e2a80, node_runtime_index = 569, next_frame_index = 949}
> > (gdb) p nm->pending_frames[4]
> > $14 = {frame = 0x7f3f302e4140, node_runtime_index = 567, next_frame_index = 2629}
> > (gdb) p nm->pending_frames[5]
> > $15 = {frame = 0x7f3f30523280, node_runtime_index = 251, next_frame_index = 2605}
> > (gdb) p nm->pending_frames[6]
> > $16 = {frame = 0x7f3f302e2a80, node_runtime_index = 569, next_frame_index = 949}
> > (gdb) p nm->pending_frames[7]
> > $17 = {frame = 0x7f3f302e4140, node_runtime_index = 567, next_frame_index = 2629}
> > (gdb) p nm->pending_frames[8]
> > $18 = {frame = 0x7f3f30523280, node_runtime_index = 251, next_frame_index = 2605}
> > (gdb) p nm->pending_frames[100]
> > $19 = {frame = 0x7f3f302e4140, node_runtime_index = 567, next_frame_index = 2629}
> > (gdb) p nm->pending_frames[101]
> > $20 = {frame = 0x7f3f30523280, node_runtime_index = 251, next_frame_index = 2605}
> > (gdb) p nm->pending_frames[102]
> > $21 = {frame = 0x7f3f302e2a80, node_runtime_index = 569, next_frame_index = 949}
> >
> > 3. all the frame points to same buffer:
> > (gdb) get_buf_index_from_frame 0x7f3f30709b80
> > $56 = (u32 *) 0x7f3f30709b90
> > (gdb) p *$56
> > $57 = 4988539
> > (gdb) get_buf_index_from_frame 0x7f3f30709280
> > $58 = (u32 *) 0x7f3f30709290
> > (gdb) p *$58
> > $59 = 4988539
> > (gdb) get_buf_index_from_frame 0x7f3f302e2140
> > $60 = (u32 *) 0x7f3f302e2150
> > (gdb) p *$60
> > $61 = 4988539
> > (gdb) get_buf_index_from_frame 0x7f3f302e2a80
> > $62 = (u32 *) 0x7f3f302e2a90
> > (gdb) p *$62
> > $63 = 4988539
> > (gdb) get_buf_index_from_frame 0x7f3f302e4140
> > $64 = (u32 *) 0x7f3f302e4150
> > (gdb) p *$64
> > $65 = 4988539
> > (gdb) get_buf_index_from_frame 0x7f3f30523280
> > $66 = (u32 *) 0x7f3f30523290
> > (gdb) p *$66
> > $67 = 4988539
> > (gdb)
> >
> > 4. buffer data shows that it's a v6 BGP packet.
> >
> > 5. next_frame vector:
> > (gdb) p nm->next_frames
> > $50 = (vlib_next_frame_t *) 0x7f3f30852c00
> > (gdb) p nm->next_frames[2605]
> > $51 = {frame = 0x0, node_runtime_index = 251, flags = 32772, vectors_since_last_overflow = 3412084}
> > (gdb) p /t nm->next_frames[2605].flags
> > $52 = 1000000000000100
> > (gdb) p nm->nodes_by_type[VLIB_NODE_TYPE_INTERNAL]
> > $53 = (vlib_node_runtime_t *) 0x7f3f30b2ad00
> > (gdb) p nm->nodes_by_type[VLIB_NODE_TYPE_INTERNAL][251]
> > $54 = {cacheline0 = 0x7f3f30b32a80 "", function = 0x7f3edac5b300 <custom_input_node_fn>, errors = 0x7f3edcaf89e0,
> >   clocks_since_last_overflow = 1303441754, max_clock = 17932868, max_clock_n = 1,
> >   calls_since_last_overflow = 3412083, vectors_since_last_overflow = 3412083, next_frame_index = 947, node_index = 291,
> >   input_main_loops_per_call = 0, main_loop_count_last_dispatch = 92468689, main_loop_vector_stats = {0, 3412081},
> >   flags = 0, state = 0, n_next_nodes = 3, cached_next_index = 2, thread_index = 3, runtime_data = 0x7f3f30b32ac6 ""}
> > (gdb)
> > (gdb) get_vec_len nm->next_frames
> > $69 = 4005
> > (gdb)
> >
> > 6. node runtime index to name mapping:
> > (gdb) p nm->nodes[648].name
> > $70 = (u8 *) 0x7f3edcd6e130 "ip4-sv-reassembly"
> > (gdb) p nm->nodes[550].name
> > $71 = (u8 *) 0x7f3edcd3f630 "esp4-decrypt-post"
> > (gdb) p nm->nodes[548].name
> > $72 = (u8 *) 0x7f3edcd3dec0 "esp6-decrypt-post"
> > (gdb) p nm->nodes[569].name
> > $73 = (u8 *) 0x7f3edcd48680 "ah4-decrypt-handoff"
> > (gdb) p nm->nodes[567].name
> > $74 = (u8 *) 0x7f3edcd47cf0 "ipsec4-input-feature"
> > (gdb) p nm->nodes[251].name
> > $75 = (u8 *) 0x7f3edcac17c0 "dns46_reply"
> > (gdb)
> >
> >
> > Thanks
> > Vipin A.
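
For reference, the two observations in the quoted report (the huge allocation and the repeated pending_frames entries) can be checked offline with a small standalone C sketch. This is not VPP code: pending_entry_t is a hypothetical stand-in that only mirrors the three fields printed by gdb, the helper names are made up for illustration, and the size estimate assumes that data_bytes in frame #4 already includes the 8-byte header shown there. Under those assumptions the failing request corresponds to roughly 10.4 million 16-byte entries (about 159 MiB), and the first nine entries printed above contain three repeats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the entries printed by gdb above.
   It mirrors the three printed fields only; it is NOT the VPP definition. */
typedef struct
{
  uint64_t frame;              /* frame address as printed by gdb */
  uint32_t node_runtime_index;
  uint32_t next_frame_index;
} pending_entry_t;

/* pending_frames[0..8], copied from the gdb session in the report. */
static const pending_entry_t sample[] = {
  { 0x7f3f30709b80ULL, 648, 4294967295u },
  { 0x7f3f30709280ULL, 550, 2299 },
  { 0x7f3f302e2140ULL, 548, 2315 },
  { 0x7f3f302e2a80ULL, 569, 949 },
  { 0x7f3f302e4140ULL, 567, 2629 },
  { 0x7f3f30523280ULL, 251, 2605 },
  { 0x7f3f302e2a80ULL, 569, 949 },
  { 0x7f3f302e4140ULL, 567, 2629 },
  { 0x7f3f30523280ULL, 251, 2605 },
};
#define SAMPLE_LEN (sizeof (sample) / sizeof (sample[0]))

/* Count entries whose frame address already appeared at a lower index.
   Quadratic on purpose: intended for a small sample, not 10M entries. */
static size_t
count_duplicate_frames (const pending_entry_t *p, size_t n)
{
  size_t dups = 0;
  for (size_t i = 1; i < n; i++)
    for (size_t j = 0; j < i; j++)
      if (p[j].frame == p[i].frame)
        {
          dups++;
          break;
        }
  return dups;
}

/* Number of entries implied by an allocation size.
   Assumption: data_bytes already includes header_bytes. */
static size_t
entries_for_bytes (size_t data_bytes, size_t header_bytes)
{
  if (data_bytes < header_bytes)
    return 0;
  return (data_bytes - header_bytes) / sizeof (pending_entry_t);
}

#ifdef PENDING_DEMO_MAIN
int
main (void)
{
  assert (sizeof (pending_entry_t) == 16);
  printf ("duplicates in sample: %zu of %zu\n",
          count_duplicate_frames (sample, SAMPLE_LEN), (size_t) SAMPLE_LEN);
  printf ("entries requested: %zu\n", entries_for_bytes (167075112u, 8));
  return 0;
}
#endif
```

Building with -DPENDING_DEMO_MAIN gives a small program that prints both figures. For a full core dump a gdb script walking nm->pending_frames would be more practical; the sketch only illustrates the check on the values already posted.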