PTAL: https://gerrit.fd.io/r/c/vpp/+/36334



On Fri, Jun 3, 2022 at 10:25 PM Pim van Pelt via lists.fd.io <pim=
ipng...@lists.fd.io> wrote:

> Hi Damjan,
>
> Just a quick note: 22.06 still has this regression:
>
> 1: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci <
> n->n_errors' fails
>
>
> Would a reasonable fix be to have counter_index() return a sentinel here
> instead of asserting, and to make the two call sites (drop.c lines 95 and
> 224) tolerant of that?
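>
> To make that concrete, a minimal sketch of what I mean (this is my
> assumption of a fix direction, not the code in the change above; the
> CLIB_U32_MAX sentinel is hypothetical):
>
>   #include <vlib/vlib.h>
>
>   /* counter_index() variant: return a sentinel instead of asserting
>      when the error code is out of range, e.g. because the owning
>      node's interface was deleted. */
>   static u32
>   counter_index (vlib_main_t *vm, vlib_error_t e)
>   {
>     vlib_node_t *n;
>     u32 ci, ni;
>
>     ni = vlib_error_get_node (&vm->node_main, e);
>     n = vlib_get_node (vm, ni);
>
>     ci = vlib_error_get_code (&vm->node_main, e);
>     if (ci >= n->n_errors)
>       return CLIB_U32_MAX; /* stale error; no valid counter */
>
>     ci += n->error_heap_index;
>     return ci;
>   }
>
> The two call sites would then skip the counter update whenever they see
> the sentinel, rather than asserting.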
>
> On Thu, Apr 7, 2022 at 3:11 PM Damjan Marion (damarion) <
> damar...@cisco.com> wrote:
>
>>
>> Yeah, it looks like ip4_neighbor_probe is sending a packet to a deleted
>> interface:
>>
>> (gdb)p n->name
>> $4 = (u8 *) 0x7fff82b47578 "interface-3-output-deleted"
>>
>> So it is right that this assert kicks in.
>>
>> Likely what happens is that the batch of commands first triggers
>> generation of a neighbor probe packet, and immediately after that the
>> interface is deleted; the packet is still in flight, so the drop node
>> tries to bump counters for the deleted interface.
>>
>> —
>> Damjan
>>
>>
>>
>> > On 06.04.2022., at 16:21, Pim van Pelt <p...@ipng.nl> wrote:
>> >
>> > Hoi,
>> >
>> > The following reproduces the drop.c:77 assertion:
>> >
>> > create loopback interface instance 0
>> > set interface ip address loop0 10.0.0.1/32
>> > set interface state GigabitEthernet3/0/1 up
>> > set interface state loop0 up
>> > set interface state loop0 down
>> > set interface ip address del loop0 10.0.0.1/32
>> > delete loopback interface intfc loop0
>> > set interface state GigabitEthernet3/0/1 down
>> > set interface state GigabitEthernet3/0/1 up
>> > comment { the following crashes VPP }
>> > set interface state GigabitEthernet3/0/1 down
>> >
>> > I found that adding IPv6 addresses does not provoke the crash, while
>> adding IPv4 addresses to loop0 does provoke it.
>> >
>> > groet,
>> > Pim
>> >
>> > On Wed, Apr 6, 2022 at 3:56 PM Pim van Pelt via lists.fd.io <pim=
>> ipng...@lists.fd.io> wrote:
>> > Hoi,
>> >
>> > The crash I observed is now gone, thanks!
>> >
>> > VPP occasionally hits an ASSERT related to error counters at drop.c:77
>> -- I'll try to see if I can get a reproduction, but it may take a while,
>> and it may be transient.
>> >
>> > 11: /home/pim/src/vpp/src/vlib/drop.c:77 (counter_index) assertion `ci
>> < n->n_errors' fails
>> >
>> > Thread 14 "vpp_wk_11" received signal SIGABRT, Aborted.
>> > [Switching to Thread 0x7fff4bbfd700 (LWP 182685)]
>> > __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
>> > 50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> > (gdb) bt
>> > #0  __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:50
>> > #1  0x00007ffff6a5f859 in __GI_abort () at abort.c:79
>> > #2  0x00000000004072e3 in os_panic () at
>> /home/pim/src/vpp/src/vpp/vnet/main.c:413
>> > #3  0x00007ffff6daea29 in debugger () at
>> /home/pim/src/vpp/src/vppinfra/error.c:84
>> > #4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0,
>> line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
>> >     at /home/pim/src/vpp/src/vppinfra/error.c:143
>> > #5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at
>> /home/pim/src/vpp/src/vlib/drop.c:77
>> > #6  0x00007ffff6f77c57 in process_drop_punt (vm=0x7fffa09fb2c0,
>> node=0x7fffa0c79b00, frame=0x7fff97168140,
>> disposition=ERROR_DISPOSITION_DROP)
>> >     at /home/pim/src/vpp/src/vlib/drop.c:224
>> > #7  0x00007ffff6f77957 in error_drop_node_fn_hsw (vm=0x7fffa09fb2c0,
>> node=0x7fffa0c79b00, frame=0x7fff97168140)
>> >     at /home/pim/src/vpp/src/vlib/drop.c:248
>> > #8  0x00007ffff6f0b10d in dispatch_node (vm=0x7fffa09fb2c0,
>> node=0x7fffa0c79b00, type=VLIB_NODE_TYPE_INTERNAL,
>> >     dispatch_state=VLIB_NODE_STATE_POLLING, frame=0x7fff97168140,
>> last_time_stamp=5318787653101516) at /home/pim/src/vpp/src/vlib/main.c:961
>> > #9  0x00007ffff6f0bb60 in dispatch_pending_node (vm=0x7fffa09fb2c0,
>> pending_frame_index=5, last_time_stamp=5318787653101516)
>> >     at /home/pim/src/vpp/src/vlib/main.c:1120
>> > #10 0x00007ffff6f06e0f in vlib_main_or_worker_loop (vm=0x7fffa09fb2c0,
>> is_main=0) at /home/pim/src/vpp/src/vlib/main.c:1587
>> > #11 0x00007ffff6f06537 in vlib_worker_loop (vm=0x7fffa09fb2c0) at
>> /home/pim/src/vpp/src/vlib/main.c:1721
>> > #12 0x00007ffff6f44ef4 in vlib_worker_thread_fn (arg=0x7fff98eabec0) at
>> /home/pim/src/vpp/src/vlib/threads.c:1587
>> > #13 0x00007ffff6f3ffe5 in vlib_worker_thread_bootstrap_fn
>> (arg=0x7fff98eabec0) at /home/pim/src/vpp/src/vlib/threads.c:426
>> > #14 0x00007ffff6e61609 in start_thread (arg=<optimized out>) at
>> pthread_create.c:477
>> > #15 0x00007ffff6b5c163 in clone () at
>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>> > (gdb) up 4
>> > #4  0x00007ffff6dae7fa in _clib_error (how_to_die=2, function_name=0x0,
>> line_number=0, fmt=0x7ffff6f9d19c "%s:%d (%s) assertion `%s' fails")
>> >     at /home/pim/src/vpp/src/vppinfra/error.c:143
>> > 143         debugger ();
>> > (gdb) up
>> > #5  0x00007ffff6f782d9 in counter_index (vm=0x7fffa09fb2c0, e=3416) at
>> /home/pim/src/vpp/src/vlib/drop.c:77
>> > 77        ASSERT (ci < n->n_errors);
>> > (gdb) list
>> > 72
>> > 73        ni = vlib_error_get_node (&vm->node_main, e);
>> > 74        n = vlib_get_node (vm, ni);
>> > 75
>> > 76        ci = vlib_error_get_code (&vm->node_main, e);
>> > 77        ASSERT (ci < n->n_errors);
>> > 78
>> > 79        ci += n->error_heap_index;
>> > 80
>> > 81        return ci;
>> >
>> > On Wed, Apr 6, 2022 at 1:53 PM Damjan Marion (damarion) <
>> damar...@cisco.com> wrote:
>> >
>> > This seems to be a day-one issue, and my patch just exposed it.
>> > The current interface deletion code does not remove node stats entries.
>> >
>> > So if you delete an interface and then create one with the same name,
>> > the stats entry is already there, and creation of the new entry fails.
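>> >
>> > A rough sketch of the idea, to make it concrete (hypothetical: the
>> > helper below and the "/err/%s" key are my assumptions, not the
>> > contents of the patch):
>> >
>> >   #include <vlib/vlib.h>
>> >   #include <vlib/stats/stats.h>
>> >
>> >   /* On interface deletion, remove the stale stats directory entry
>> >      so that a later interface with the same node name can register
>> >      a fresh one. */
>> >   static void
>> >   remove_stale_error_stats (char *node_name)
>> >   {
>> >     u32 ei = vlib_stats_find_entry_index ("/err/%s", node_name);
>> >     if (ei != ~0) /* entry left behind by the deleted interface */
>> >       vlib_stats_remove_entry (ei);
>> >   }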
>> >
>> > Hope this helps:
>> >
>> > https://gerrit.fd.io/r/c/vpp/+/35900
>> >
>> > —
>> > Damjan
>> >
>> >
>> >
>> > > On 05.04.2022., at 22:13, Pim van Pelt <p...@ipng.nl> wrote:
>> > >
>> > > Hoi,
>> > >
>> > > Here's a minimal repro that reliably crashes VPP at HEAD for me; it
>> does not crash before gerrit 35640:
>> > >
>> > > create loopback interface instance 0
>> > > create bond id 0 mode lacp load-balance l34
>> > > create bond id 1 mode lacp load-balance l34
>> > > delete loopback interface intfc loop0
>> > > delete bond BondEthernet0
>> > > delete bond BondEthernet1
>> > > create bond id 0 mode lacp load-balance l34
>> > > delete bond BondEthernet0
>> > > comment { the next command crashes VPP }
>> > > create loopback interface instance 0
>> > >
>> > >
>> > >
>> > > On Tue, Apr 5, 2022 at 9:48 PM Pim van Pelt <p...@ipng.nl> wrote:
>> > > Hoi,
>> > >
>> > > There is a crashing regression in VPP after
>> https://gerrit.fd.io/r/c/vpp/+/35640
>> > >
>> > > With that change merged, VPP crashes upon creation and deletion of
>> interfaces. With the repo wound back to before 35640, there is no crash.
>> The crash happens in:
>> > > 0: /home/pim/src/vpp/src/vlib/stats/stats.h:115
>> (vlib_stats_get_entry) assertion `entry_index < vec_len
>> (sm->directory_vector)' fails
>> > >
>> > > (gdb) bt
>> > > #0  __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:50
>> > > #1  0x00007ffff6a5e859 in __GI_abort () at abort.c:79
>> > > #2  0x00000000004072e3 in os_panic () at
>> /home/pim/src/vpp/src/vpp/vnet/main.c:413
>> > > #3  0x00007ffff6dada29 in debugger () at
>> /home/pim/src/vpp/src/vppinfra/error.c:84
>> > > #4  0x00007ffff6dad7fa in _clib_error (how_to_die=2,
>> function_name=0x0, line_number=0, fmt=0x7ffff6f9c19c "%s:%d (%s) assertion
>> `%s' fails")
>> > >    at /home/pim/src/vpp/src/vppinfra/error.c:143
>> > > #5  0x00007ffff6f39605 in vlib_stats_get_entry (sm=0x7ffff6fce5e8
>> <vlib_stats_main>, entry_index=4294967295)
>> > >    at /home/pim/src/vpp/src/vlib/stats/stats.h:115
>> > > #6  0x00007ffff6f39273 in vlib_stats_remove_entry
>> (entry_index=4294967295) at /home/pim/src/vpp/src/vlib/stats/stats.c:135
>> > > #7  0x00007ffff6ee36d9 in vlib_register_errors (vm=0x7fff96800740,
>> node_index=718, n_errors=0, error_strings=0x0, counters=0x0)
>> > >    at /home/pim/src/vpp/src/vlib/error.c:149
>> > > #8  0x00007ffff70b8e0c in setup_tx_node (vm=0x7fff96800740,
>> node_index=718, dev_class=0x7fff973f9fb0) at
>> /home/pim/src/vpp/src/vnet/interface.c:816
>> > > #9  0x00007ffff70b7f26 in vnet_register_interface (vnm=0x7ffff7f579a0
>> <vnet_main>, dev_class_index=31, dev_instance=0, hw_class_index=29,
>> > >    hw_instance=7) at /home/pim/src/vpp/src/vnet/interface.c:1085
>> > > #10 0x00007ffff7129efd in vnet_eth_register_interface
>> (vnm=0x7ffff7f579a0 <vnet_main>, r=0x7fff4b288f18)
>> > >    at /home/pim/src/vpp/src/vnet/ethernet/interface.c:376
>> > > #11 0x00007ffff712bd05 in vnet_create_loopback_interface
>> (sw_if_indexp=0x7fff4b288fb8, mac_address=0x7fff4b288fb2 "", is_specified=1
>> '\001',
>> > >    user_instance=0) at
>> /home/pim/src/vpp/src/vnet/ethernet/interface.c:883
>> > > #12 0x00007ffff712fecf in create_simulated_ethernet_interfaces
>> (vm=0x7fff96800740, input=0x7fff4b2899d0, cmd=0x7fff973c7e38)
>> > >    at /home/pim/src/vpp/src/vnet/ethernet/interface.c:930
>> > > #13 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands
>> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
>> input=0x7fff4b2899d0,
>> > >    parent_command_index=1161) at /home/pim/src/vpp/src/vlib/cli.c:592
>> > > #14 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands
>> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
>> input=0x7fff4b2899d0,
>> > >    parent_command_index=33) at /home/pim/src/vpp/src/vlib/cli.c:549
>> > > #15 0x00007ffff6ed6358 in vlib_cli_dispatch_sub_commands
>> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
>> input=0x7fff4b2899d0,
>> > >    parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:549
>> > > #16 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740,
>> input=0x7fff4b2899d0, function=0x0, function_arg=0)
>> > >    at /home/pim/src/vpp/src/vlib/cli.c:695
>> > > #17 0x00007ffff6f61f21 in unix_cli_exec (vm=0x7fff96800740,
>> input=0x7fff4b289e78, cmd=0x7fff973c99d8) at
>> /home/pim/src/vpp/src/vlib/unix/cli.c:3454
>> > > #18 0x00007ffff6ed65e8 in vlib_cli_dispatch_sub_commands
>> (vm=0x7fff96800740, cm=0x42c2f0 <vlib_global_main+48>,
>> input=0x7fff4b289e78,
>> > >    parent_command_index=0) at /home/pim/src/vpp/src/vlib/cli.c:592
>> > > #19 0x00007ffff6ed5528 in vlib_cli_input (vm=0x7fff96800740,
>> input=0x7fff4b289e78, function=0x7ffff6f55960 <unix_vlib_cli_output>,
>> function_arg=1)
>> > >    at /home/pim/src/vpp/src/vlib/cli.c:695
>> > >
>> > > This is caught by a local regression test (
>> https://github.com/pimvanpelt/vppcfg/tree/main/intest) that executes a
>> bunch of CLI statements, and I have a set of transitions there which I can
>> probably narrow down to an exact repro case.
>> > >
>> > > On Fri, Apr 1, 2022 at 3:08 PM Pim van Pelt via lists.fd.io <pim=
>> ipng...@lists.fd.io> wrote:
>> > > Hoi,
>> > >
>> > > As a follow-up: I tried to remember why I copied class VPPStats() and
>> friends into my own repository; it may be because it's not exported in
>> __init__.py. Should it be? I pulled the latest changes Damjan made to
>> vpp_stats.py into my own repo, and my app runs again. Is it worth adding
>> the VPPStats() class to the exported classes in vpp_papi?
>> > >
>> > > groet,
>> > > Pim
>> > >
>> > > On Fri, Apr 1, 2022 at 2:50 PM Pim van Pelt via lists.fd.io <pim=
>> ipng...@lists.fd.io> wrote:
>> > > Hoi,
>> > >
>> > > I noticed that my VPP SNMP Agent no longer works with the Python API
>> at HEAD, and my attention was drawn to this change:
>> > > https://gerrit.fd.io/r/c/vpp/+/35640
>> > > stats: convert error counters to normal counters
>> > >
>> > >
>> > > At HEAD, src/vpp-api/python/vpp_papi/vpp_stats.py now fails 4 out of
>> 6 tests with the same error as my application:
>> > > struct.error: offset -140393469444104 out of range for
>> 1073741824-byte buffer
>> > > ..
>> > > Ran 6 tests in 0.612s
>> > > FAILED (errors=4)
>> > >
>> > > Damjan, Ole, any clues?
>> > >
>> > > groet,
>> > > Pim
>> > > --
>> > > Pim van Pelt <p...@ipng.nl>
>> > > PBVP1-RIPE - http://www.ipng.nl/
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Pim van Pelt <p...@ipng.nl>
>> > > PBVP1-RIPE - http://www.ipng.nl/
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > --
>> > > Pim van Pelt <p...@ipng.nl>
>> > > PBVP1-RIPE - http://www.ipng.nl/
>> > >
>> > >
>> > > --
>> > > Pim van Pelt <p...@ipng.nl>
>> > > PBVP1-RIPE - http://www.ipng.nl/
>> > >
>> > >
>> > >
>> >
>> >
>> >
>> > --
>> > Pim van Pelt <p...@ipng.nl>
>> > PBVP1-RIPE - http://www.ipng.nl/
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Pim van Pelt <p...@ipng.nl>
>> > PBVP1-RIPE - http://www.ipng.nl/
>> >
>> >
>> >
>>
>>
>
> --
> Pim van Pelt <p...@ipng.nl>
> PBVP1-RIPE - http://www.ipng.nl/
>
> 
>
>

-- 
Pim van Pelt <p...@ipng.nl>
PBVP1-RIPE - http://www.ipng.nl/