PTAL https://gerrit.fd.io/r/c/vpp/+/36334
On Fri, Jun 3, 2022 at 10:25 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
> Hi Damjan,
>
> Just a quick note - 22.06 still has this regression:
>
> 1: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci < n->n_errors' fails
>
> Would a reasonable fix be for counter_index to return a sentinel here instead of asserting, with the two call sites at L95 and L224 made tolerant of that?
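>
> In other words, something like this (untested sketch; the ~0 sentinel and the exact shape are just to illustrate the idea):
>
>   static u32
>   counter_index (vlib_main_t *vm, vlib_error_t e)
>   {
>     vlib_node_t *n;
>     u32 ci, ni;
>
>     ni = vlib_error_get_node (&vm->node_main, e);
>     n = vlib_get_node (vm, ni);
>
>     ci = vlib_error_get_code (&vm->node_main, e);
>     /* tolerate errors pointing at a node whose counters are gone,
>        e.g. a just-deleted interface, instead of asserting */
>     if (ci >= n->n_errors)
>       return ~0;
>
>     ci += n->error_heap_index;
>     return ci;
>   }
>
> with the callers skipping the counter bump when they get ~0 back.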
>
> On Thu, Apr 7, 2022 at 3:11 PM Damjan Marion (damarion) <damar...@cisco.com> wrote:
>>
>> Yeah, it looks like ip4_neighbor_probe is sending a packet to a deleted interface:
>>
>> (gdb) p n->name
>> $4 = (u8 *) 0x7fff82b47578 "interface-3-output-deleted"
>>
>> So it is right that this assert kicks in.
>>
>> Likely what happens is that the batch of commands first triggers generation of a neighbor probe packet; immediately afterwards the interface is deleted, but the packet is still in flight, and the drop node tries to bump counters for the deleted interface.
>>
>> --
>> Damjan
>>
>> > On 06.04.2022., at 16:21, Pim van Pelt <p...@ipng.nl> wrote:
>> >
>> > Hoi,
>> >
>> > The following reproduces the drop.c:77 assertion:
>> >
>> > create loopback interface instance 0
>> > set interface ip address loop0 10.0.0.1/32
>> > set interface state GigabitEthernet3/0/1 up
>> > set interface state loop0 up
>> > set interface state loop0 down
>> > set interface ip address del loop0 10.0.0.1/32
>> > delete loopback interface intfc loop0
>> > set interface state GigabitEthernet3/0/1 down
>> > set interface state GigabitEthernet3/0/1 up
>> > comment { the following crashes VPP }
>> > set interface state GigabitEthernet3/0/1 down
>> >
>> > I found that adding an IPv6 address to loop0 does not provoke the crash, while adding an IPv4 address does.
>> >
>> > groet,
>> > Pim
>> >
>> > On Wed, Apr 6, 2022 at 3:56 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
>> > Hoi,
>> >
>> > The crash I observed is now gone, thanks!
>> >
>> > VPP occasionally hits an ASSERT related to error counters at drop.c:77 -- I'll try to see if I can get a reproduction, but it may take a while, and it may be transient.
>> >
>> > 11: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci < n->n_errors' fails
>> >
>> > Thread 14 "vpp_wk_11" received signal SIGABRT, Aborted.
>> > [Switching to Thread 0x7fff4bbfd700 (LWP 182685)]
>> > __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>> > 50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> > (gdb) bt
>> > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>> > #1  0x00007ffff6a5f859 in __GI_abort () at abort.c:79
>> > #2  0x00000000004072e3 in os_panic () at /home/pim/src/vpp/src/vpp/vnet/main.c:413
>> > #3  0x00007ffff6daea29 in debugger () at /home/pim/src/vpp/src/vppinfra/error.c:84
>> > #4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails") at /home/pim/src/vpp/src/vppinfra/error.c:143
>> > #5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at /home/pim/src/vpp/src/vlib/drop.c:77
>> > #6  0x00007ffff6f77c57 in process_drop_punt (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, frame=0x7fff97168140, disposition=ERROR_DISPOSITION_DROP) at /home/pim/src/vpp/src/vlib/drop.c:224
>> > #7  0x00007ffff6f77957 in error_drop_node_fn_hsw (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, frame=0x7fff97168140) at /home/pim/src/vpp/src/vlib/drop.c:248
>> > #8  0x00007ffff6f0b10d in dispatch_node (vm=0x7fffa09fb2c0, node=0x7fffa0c79b00, type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fff97168140, last_time_stamp=5318787653101516) at /home/pim/src/vpp/src/vlib/main.c:961
>> > #9  0x00007ffff6f0bb60 in dispatch_pending_node (vm=0x7fffa09fb2c0, pending_frame_index=5, last_time_stamp=5318787653101516) at /home/pim/src/vpp/src/vlib/main.c:1120
>> > #10 0x00007ffff6f06e0f in vlib_main_or_worker_loop (vm=0x7fffa09fb2c0, is_main=0) at /home/pim/src/vpp/src/vlib/main.c:1587
>> > #11 0x00007ffff6f06537 in vlib_worker_loop (vm=0x7fffa09fb2c0) at /home/pim/src/vpp/src/vlib/main.c:1721
>> > #12 0x00007ffff6f44ef4 in vlib_worker_thread_fn (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:1587
>> > #13 0x00007ffff6f3ffe5 in vlib_worker_thread_bootstrap_fn (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:426
>> > #14 0x00007ffff6e61609 in start_thread (arg=<optimized out>) at pthread_create.c:477
>> > #15 0x00007ffff6b5c163 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>> > (gdb) up 4
>> > #4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails") at /home/pim/src/vpp/src/vppinfra/error.c:143
>> > 143       debugger ();
>> > (gdb) up
>> > #5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at /home/pim/src/vpp/src/vlib/drop.c:77
>> > 77        ASSERT (ci < n->n_errors);
>> > (gdb) list
>> > 72
>> > 73        ni = vlib_error_get_node (&vm->node_main, e);
>> > 74        n = vlib_get_node (vm, ni);
>> > 75
>> > 76        ci = vlib_error_get_code (&vm->node_main, e);
>> > 77        ASSERT (ci < n->n_errors);
>> > 78
>> > 79        ci += n->error_heap_index;
>> > 80
>> > 81        return ci;
>> >
>> > On Wed, Apr 6, 2022 at 1:53 PM Damjan Marion (damarion) <damar...@cisco.com> wrote:
>> >
>> > This seems to be a day-one issue, and my patch just exposed it.
>> > The current interface deletion code does not remove node stats entries.
>> >
>> > So if you delete an interface and then create one with the same name, the stats entry is already there, and creation of the new entry fails.
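>> >
>> > Conceptually, the missing cleanup is something along these lines (simplified, untested sketch; the stats_entry_index field name is illustrative, see the patch below for the actual change):
>> >
>> >   /* when a node's error counters are torn down, also remove the
>> >      corresponding stats segment entry, so a later interface that
>> >      reuses the same name can register it again */
>> >   if (n->stats_entry_index != ~0)
>> >     {
>> >       vlib_stats_remove_entry (n->stats_entry_index);
>> >       n->stats_entry_index = ~0;
>> >     }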
>> >
>> > Hope this helps:
>> >
>> > https://gerrit.fd.io/r/c/vpp/+/35900
>> >
>> > --
>> > Damjan
>> >
>> > > On 05.04.2022., at 22:13, Pim van Pelt <p...@ipng.nl> wrote:
>> > >
>> > > Hoi,
>> > >
>> > > Here's a minimal repro that reliably crashes VPP at head for me; it does not crash before gerrit 35640:
>> > >
>> > > create loopback interface instance 0
>> > > create bond id 0 mode lacp load-balance l34
>> > > create bond id 1 mode lacp load-balance l34
>> > > delete loopback interface intfc loop0
>> > > delete bond BondEthernet0
>> > > delete bond BondEthernet1
>> > > create bond id 0 mode lacp load-balance l34
>> > > delete bond BondEthernet0
>> > > comment { the next command crashes VPP }
>> > > create loopback interface instance 0
>> > >
>> > > On Tue, Apr 5, 2022 at 9:48 PM Pim van Pelt <p...@ipng.nl> wrote:
>> > > Hoi,
>> > >
>> > > There is a crashing regression in VPP after https://gerrit.fd.io/r/c/vpp/+/35640
>> > >
>> > > With that change merged, VPP crashes upon creation and deletion of interfaces; winding the repo back to before 35640, it does not crash. The crash happens in:
>> > >
>> > > 0: /home/pim/src/vpp/src/vlib/stats/stats.h:115 (vlib_stats_get_entry) assertion `entry_index < vec_len (sm->directory_vector)' fails
>> > >
>> > > (gdb) bt
>> > > #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>> > > #1  0x00007ffff6a5e859 in __GI_abort () at abort.c:79
>> > > #2  0x00000000004072e3 in os_panic () at /home/pim/src/vpp/src/vpp/vnet/main.c:413
>> > > #3  0x00007ffff6dada29 in debugger () at /home/pim/src/vpp/src/vppinfra/error.c:84
>> > > #4  0x00007ffff6dad7fa in _clib_error (how_to_die=2, function_name=0x0, line_number=0, fmt=0x7ffff6f9c19c "%s:%d (%s) assertion `%s' fails") at /home/pim/src/vpp/src/vppinfra/error.c:143
>> > > #5  0x00007ffff6f39605 in vlib_stats_get_entry (sm=0x7ffff6fce5e8 <vlib_stats_main>, entry_index=4294967295) at /home/pim/src/vpp/src/vlib/stats/stats.h:115
>> > > #6  0x00007ffff6f39273 in vlib_stats_remove_entry (entry_index=4294967295) at /home/pim/src/vpp/src/vlib/stats/stats.c:135
>> > > #7  0x00007ffff6ee36d9 in vlib_register_errors (vm=0x7fff96800740, node_index=718, n_errors=0, error_strings=0x0, counters=0x0) at /home/pim/src/vpp/src/vlib/error.c:149
>> > > #8  0x00007ffff70b8e0c in setup_tx_node (vm=0x7fff96800740, node_index=718, dev_class=0x7fff973f9fb0) at /home/pim/src/vpp/src/vnet/interface.c:816
>> > > #9  0x00007ffff70b7f26 in vnet_register_interface (vnm=0x7ffff7f579a0 <vnet_main>, dev_class_index=31, dev_instance=0, hw_class_index=29, hw_instance=7) at /home/pim/src/vpp/src/vnet/interface.c:1085
>> > > #10 0x00007ffff7129efd in vnet_eth_register_interface (vnm=0x7ffff7f579a0 <vnet_main>, r=0x7fff4b288f18) at /home/pim/src/vpp/src/vnet/ethernet/interface.c:376
>> > > #11 0x00007ffff712bd05 in vnet_create_loopback_interface (sw_if_indexp=0x7fff4b288fb8, mac_address=0x7fff4b288fb2 "", is_specified=1 '\001', user_instance=0) at /home/pim/src/vpp/src/vnet/ethernet/interface.c:883
>> > > #12 0x00007ffff712fecf in create_simulated_ethernet_interfaces (vm=0x7fff96800740, input=0x7fff4b2899d0, cmd=0x7fff973c7e38) at /home/pim/src/vpp/src/vnet/ethernet/interface.c:930
>> > > #13 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=1161) at /home/pim/src/vpp/src/vlib/cli.c:592
>> > > #14 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=33) at /home/pim/src/vpp/src/vlib/cli.c:549
>> > > #15 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b2899d0, parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:549
>> > > #16 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740, input=0x7fff4b2899d0, function=0x0, function_arg=0) at /home/pim/src/vpp/src/vlib/cli.c:695
>> > > #17 0x00007ffff6f61f21 in unix_cli_exec (vm=0x7fff96800740, input=0x7fff4b289e78, cmd=0x7fff973c99d8) at /home/pim/src/vpp/src/vlib/unix/cli.c:3454
>> > > #18 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>, input=0x7fff4b289e78, parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:592
>> > > #19 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740, input=0x7fff4b289e78, function=0x7ffff6f55960 <unix_vlib_cli_output>, function_arg=1) at /home/pim/src/vpp/src/vlib/cli.c:695
>> > >
>> > > This is caught by a local regression test (https://github.com/pimvanpelt/vppcfg/tree/main/intest) that executes a bunch of CLI statements, and I have a set of transitions there which I can probably narrow down to an exact repro case.
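>> > >
>> > > Note that in frames #6 and #7, vlib_register_errors() calls vlib_stats_remove_entry() with entry_index=4294967295 (i.e. ~0, "no entry"), which is what trips the assert in vlib_stats_get_entry(). A defensive guard at that call site, something like the following (untested, and it would only paper over the underlying stale-entry problem), avoids the crash:
>> > >
>> > >   /* skip removal when the caller holds no valid stats entry index */
>> > >   if (entry_index != ~0)
>> > >     vlib_stats_remove_entry (entry_index);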
>> > >
>> > > On Fri, Apr 1, 2022 at 3:08 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
>> > > Hoi,
>> > >
>> > > As a followup - I tried to remember why I copied class VPPStats() and friends into my own repository; it is probably because it's not exported in __init__.py. Should it be? I pulled the latest changes Damjan made to vpp_stats.py into my own repo, and my app runs again. Would it be worth adding the VPPStats() class to the exported classes in vpp_papi?
>> > >
>> > > groet,
>> > > Pim
>> > >
>> > > On Fri, Apr 1, 2022 at 2:50 PM Pim van Pelt via lists.fd.io <pim=ipng...@lists.fd.io> wrote:
>> > > Hoi,
>> > >
>> > > I noticed that my VPP SNMP Agent no longer works with the python API at HEAD, and my attention was drawn to this change:
>> > > https://gerrit.fd.io/r/c/vpp/+/35640
>> > > stats: convert error counters to normal counters
>> > >
>> > > At HEAD, src/vpp-api/python/vpp_papi/vpp_stats.py now fails 4 out of 6 tests with the same error as my application:
>> > >
>> > > struct.error: offset -140393469444104 out of range for 1073741824-byte buffer
>> > > ..
>> > > Ran 6 tests in 0.612s
>> > > FAILED (errors=4)
>> > >
>> > > Damjan, Ole, any clues?
>> > >
>> > > groet,
>> > > Pim
>> > > --
>> > > Pim van Pelt <p...@ipng.nl>
>> > > PBVP1-RIPE - http://www.ipng.nl/
>
> --
> Pim van Pelt <p...@ipng.nl>
> PBVP1-RIPE - http://www.ipng.nl/

--
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/