On 1/9/25 10:06, Friedrich Weber via discuss wrote:
> Hi,
> 
> (first reported this via other channels and was asked to repost here)
> 
> One of our Proxmox VE users reported that when hitting CTRL+C on a running
> `ovs-tcpdump` on a host with VM workloads, there is a low (<1/50) chance of a
> complete OVS network freeze (i.e., the host loses network connection on all
> OVS-controlled interfaces). Soft lockups are logged, and they need to
> hard-reset the host.
> 
> I believe I can reproduce the same issue in an Arch Linux VM (specs [1]) with
> the following steps:
> 
> - set up an OVS bridge with an active-backup bond0 and assign an IP [2]
> - install iperf2 and run a script [3] that spawns an iperf server, then
>   repeatedly performs the following:
>   1. start ovs-tcpdump on bond0, which spawns an "inner" tcpdump
>   2. concurrently, start ordinary tcpdump on mibond0 (the interface created by
>      ovs-tcpdump)
>   3. send SIGINT to the inner tcpdump from step 1. The tcpdump from step 2 
> exits
>      with "tcpdump: pcap_loop: The interface disappeared".
> - run long-running parallel iperf2 against bond0 from outside [4]. I'm running
>   the iperf from the hypervisor, achieving a cumulative bandwidth of 
> 55-60Gbit/s.
> - after 5-10min the host usually becomes unreachable via the bond (iperf 
> reports
>   zero bandwidth), and a soft lockup is logged on the host [5]
> 
> Note that the user who originally reported this only starts one ovs-tcpdump
> process -- so there is probably some other unidentified factor that makes the
> issue more likely to trigger on their host.
> 
> The symptoms sound similar to the ones described in the kernel patch "net:
> openvswitch: fix race on port output" [6], but as far as I can tell, this 
> patch
> is already contained in 6.12.
> 

Thanks, Friedrich, for the report and for posting it here!

As previously discussed, this is the same issue that the patch [6] attempted
to fix, but it wasn't fixed completely, because checking netif_carrier_ok()
is not enough for the dummy netdev that we use for ovs-tcpdump.

The dummy network device doesn't implement the ndo_stop callback, so its
carrier status is not turned off while the device is being stopped.  We
should also check that the device is running.
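
For illustration only, the check on the port output path (where patch [6]
added the netif_carrier_ok() test) would need to become roughly the untested
sketch below; the helper name is made up and the final patch may look
different:

    static bool vport_dev_ok_for_tx(const struct net_device *dev)
    {
            /* Carrier alone is not enough here: a dummy netdev has no
             * ndo_stop, so its carrier stays up even while the device is
             * being unregistered.  Require the device to be running too. */
            return netif_running(dev) && netif_carrier_ok(dev);
    }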

I'll run a few more tests and send out the fix.

Best regards, Ilya Maximets.

> Thanks,
> 
> Friedrich
> 
> [1]
> 
> - Hypervisor is Proxmox VE 8.3 (QEMU/KVM)
> - VM has 4 cores, 8G RAM, 3x virtio-net NICs (one for management, two for
>   bond0)
> - VM is running Arch Linux with kernel 6.12.8-arch1-1 (but I can also 
> reproduce
>   with a build of 6.13rc6), and openvswitch 3.4.1 (custom built package):
> 
> $ cat /proc/version
> Linux version 6.12.8-arch1-1 (linux@archlinux) (gcc (GCC) 14.2.1 20240910, 
> GNU ld (GNU Binutils) 2.43.0) #1 SMP PREEMPT_DYNAMIC Thu, 02 Jan 2025 
> 22:52:26 +0000
> $ ovs-vswitchd --version
> ovs-vswitchd (Open vSwitch) 3.4.1
> 
> [2]
> 
> # inside the VM
> ovs-vsctl add-br br0
> ip l set eth1 up
> ip l set eth2 up
> ovs-vsctl add-bond br0 bond0 eth1 eth2
> ip l set br0 up
> ip addr add 10.2.1.104/16 dev br0
> 
> [3]
> 
> # inside the VM
> iperf -s&
> while true;
> do
>       PYTHONPATH=/usr/share/openvswitch/python/ ovs-tcpdump -env -i bond0 tcp port 12345 &
>       sleep 2
>       pid=$(pidof tcpdump)
>       echo pid: $pid
>       tcpdump -envi mibond0 tcp port 12345 &
>       sleep 5
>       echo kill
>       kill -INT $pid
>       sleep 3
> done
> 
> [4]
> 
> # from outside
> iperf -c 10.2.1.104 -i1 -t 600 -P128
> 
> [5]
> 
> Jan 08 09:01:57 arch-ovs ovs-vswitchd[446]: ovs|00074|bridge|INFO|bridge br0: 
> added interface mibond0 on port 13
> Jan 08 09:01:57 arch-ovs ovs-vswitchd[446]: 
> 2025-01-08T09:01:57Z|00074|bridge|INFO|bridge br0: added interface mibond0 on 
> port 13
> Jan 08 09:01:57 arch-ovs kernel: mibond0: entered promiscuous mode
> Jan 08 09:01:57 arch-ovs kernel: audit: type=1700 audit(1736326917.773:304): 
> dev=mibond0 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
> Jan 08 09:01:57 arch-ovs kernel: audit: type=1300 audit(1736326917.773:304): 
> arch=c000003e syscall=46 success=yes exit=52 a0=f a1=7ffcbc8bb550 a2=0 
> a3=55ab1849dd00 items=0 ppid=1 pid=446 auid=4294967295 uid=0 gid=0 euid=0 
> suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 
> comm="ovs-vswitchd" exe="/usr/bin/ovs-vswitchd" key=(null)
> Jan 08 09:01:57 arch-ovs kernel: audit: type=1327 audit(1736326917.773:304): 
> proctitle=2F7573722F7362696E2F6F76732D7673776974636864002D2D70696466696C653D2F72756E2F6F70656E767377697463682F6F76732D76737769746368642E706964
> Jan 08 09:02:04 arch-ovs systemd-networkd[479]: mibond0: Link DOWN
> Jan 08 09:02:04 arch-ovs kernel: mibond0 (unregistering): left promiscuous 
> mode
> Jan 08 09:02:04 arch-ovs kernel: audit: type=1700 audit(1736326924.733:305): 
> dev=mibond0 prom=0 old_prom=256 auid=1000 uid=0 gid=0 ses=3
> Jan 08 09:02:04 arch-ovs kernel: mibond0 selects TX queue 0, but real number 
> of TX queues is 0
> Jan 08 09:02:04 arch-ovs audit: ANOM_PROMISCUOUS dev=mibond0 prom=0 
> old_prom=256 auid=1000 uid=0 gid=0 ses=3
> Jan 08 09:02:04 arch-ovs systemd-networkd[479]: mibond0: Lost carrier
> Jan 08 09:02:30 arch-ovs kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 
> 26s! [swapper/1:0]
> Jan 08 09:02:30 arch-ovs kernel: CPU#1 Utilization every 4s during lockup:
> Jan 08 09:02:30 arch-ovs kernel:         #1:   0% system,        100% 
> softirq,          0% hardirq,          0% idle
> Jan 08 09:02:30 arch-ovs kernel:         #2:   0% system,        101% 
> softirq,          0% hardirq,          0% idle
> Jan 08 09:02:30 arch-ovs kernel:         #3:   0% system,        100% 
> softirq,          0% hardirq,          0% idle
> Jan 08 09:02:30 arch-ovs kernel:         #4:   0% system,        100% 
> softirq,          1% hardirq,          0% idle
> Jan 08 09:02:30 arch-ovs kernel:         #5:   0% system,        101% 
> softirq,          0% hardirq,          0% idle
> Jan 08 09:02:30 arch-ovs kernel: Modules linked in: dummy cfg80211 rfkill 
> isofs nfnetlink_cttimeout openvswitch nsh nf_conncount nf_nat nf_conntrack 
> nf_defrag_ipv6 nf_defrag_ipv4 psample vfat fat sha512_ssse3 sha1_ssse3 
> aesni_intel gf128mul crypto_simd cryptd psmouse pcspkr i2c_piix4 joydev 
> i2c_smbus mousedev mac_hid loop dm_mod nfnetlink vsock_loopback 
> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci 
> qemu_fw_cfg ip_tables x_tables btrfs hid_generic blake2b_generic libcrc32c 
> usbhid crc32c_generic xor raid6_pq sr_mod cdrom bochs serio_raw ata_generic 
> drm_vram_helper atkbd pata_acpi drm_ttm_helper libps2 crc32c_intel 
> vivaldi_fmap intel_agp ttm sha256_ssse3 ata_piix virtio_net virtio_balloon 
> net_failover failover virtio_scsi intel_gtt i8042 floppy serio
> Jan 08 09:02:30 arch-ovs kernel: CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not 
> tainted 6.12.8-arch1-1 #1 099de49ddaebb26408f097c48b36e50b2c8e21c9
> Jan 08 09:02:30 arch-ovs kernel: Hardware name: QEMU Standard PC (i440FX + 
> PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> Jan 08 09:02:30 arch-ovs kernel: RIP: 0010:netdev_pick_tx+0x267/0x2b0
> Jan 08 09:02:30 arch-ovs kernel: Code: c2 48 c1 e8 20 44 01 f0 44 0f b7 f0 e9 
> c7 fe ff ff e8 ad 7d 5d ff 44 0f b6 04 24 e9 08 fe ff ff 45 89 c8 e9 6e fe ff 
> ff 29 c2 <39> c2 72 89 eb f8 48 85 db 41 bf ff ff ff ff 49 0f 44 dc e9 dd fd
> Jan 08 09:02:30 arch-ovs kernel: RSP: 0018:ffffa32280120648 EFLAGS: 00000246
> Jan 08 09:02:30 arch-ovs kernel: RAX: 0000000000000000 RBX: ffff96d343cd2000 
> RCX: 00000000000005a8
> Jan 08 09:02:30 arch-ovs kernel: RDX: 0000000000000000 RSI: ffff96d342208900 
> RDI: ffff96d340364c80
> Jan 08 09:02:30 arch-ovs kernel: RBP: ffff96d342208900 R08: 0000000000000000 
> R09: 0000000000000000
> Jan 08 09:02:30 arch-ovs kernel: R10: ffff96d342208900 R11: ffffa322801208b0 
> R12: ffff96d343cd2000
> Jan 08 09:02:30 arch-ovs kernel: R13: 0000000000000000 R14: 0000000000000000 
> R15: 00000000ffffffff
> Jan 08 09:02:30 arch-ovs kernel: FS:  0000000000000000(0000) 
> GS:ffff96d477c80000(0000) knlGS:0000000000000000
> Jan 08 09:02:30 arch-ovs kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 
> 0000000080050033
> Jan 08 09:02:30 arch-ovs kernel: CR2: 000071216853a120 CR3: 0000000131ce4000 
> CR4: 00000000000006f0
> Jan 08 09:02:30 arch-ovs kernel: Call Trace:
> Jan 08 09:02:30 arch-ovs kernel:  <IRQ>
> Jan 08 09:02:30 arch-ovs kernel:  ? watchdog_timer_fn.cold+0x19c/0x219
> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
> Jan 08 09:02:30 arch-ovs kernel:  ? __hrtimer_run_queues+0x132/0x2a0
> Jan 08 09:02:30 arch-ovs kernel:  ? hrtimer_interrupt+0xfa/0x210
> Jan 08 09:02:30 arch-ovs kernel:  ? __sysvec_apic_timer_interrupt+0x55/0x100
> Jan 08 09:02:30 arch-ovs kernel:  ? sysvec_apic_timer_interrupt+0x38/0x90
> Jan 08 09:02:30 arch-ovs kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
> Jan 08 09:02:30 arch-ovs kernel:  ? netdev_pick_tx+0x267/0x2b0
> Jan 08 09:02:30 arch-ovs kernel:  ? netdev_pick_tx+0x253/0x2b0
> Jan 08 09:02:30 arch-ovs kernel:  netdev_core_pick_tx+0xa1/0xb0
> Jan 08 09:02:30 arch-ovs kernel:  __dev_queue_xmit+0x19d/0xe70
> Jan 08 09:02:30 arch-ovs kernel:  ? kmem_cache_alloc_noprof+0x111/0x2f0
> Jan 08 09:02:30 arch-ovs kernel:  do_execute_actions+0xce/0x1b70 [openvswitch 
> d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  ? flow_lookup.isra.0+0x58/0x100 
> [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  ovs_execute_actions+0x4c/0x130 [openvswitch 
> d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  ovs_dp_process_packet+0xa6/0x220 
> [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_netdev_frame_hook+0x10/0x10 
> [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_netdev_frame_hook+0x10/0x10 
> [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  ovs_vport_receive+0x84/0xe0 [openvswitch 
> d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  netdev_frame_hook+0xd9/0x1a0 [openvswitch 
> d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
> Jan 08 09:02:30 arch-ovs kernel:  
> __netif_receive_skb_core.constprop.0+0x1fa/0x10b0
> Jan 08 09:02:30 arch-ovs kernel:  __netif_receive_skb_list_core+0x15d/0x300
> Jan 08 09:02:30 arch-ovs kernel:  netif_receive_skb_list_internal+0x1d4/0x310
> Jan 08 09:02:30 arch-ovs kernel:  napi_complete_done+0x72/0x220
> Jan 08 09:02:30 arch-ovs kernel:  virtnet_poll+0x6da/0xe62 [virtio_net 
> ba458d10bdb47849f4b70ed392bbaae27b08be62]
> Jan 08 09:02:30 arch-ovs kernel:  ? free_unref_page_commit+0x169/0x2e0
> Jan 08 09:02:30 arch-ovs kernel:  ? enqueue_hrtimer+0x35/0x90
> Jan 08 09:02:30 arch-ovs kernel:  __napi_poll+0x2b/0x160
> Jan 08 09:02:30 arch-ovs kernel:  net_rx_action+0x349/0x3e0
> Jan 08 09:02:30 arch-ovs kernel:  handle_softirqs+0xe4/0x2a0
> Jan 08 09:02:30 arch-ovs kernel:  __irq_exit_rcu+0x97/0xb0
> Jan 08 09:02:30 arch-ovs kernel:  common_interrupt+0x85/0xa0
> Jan 08 09:02:30 arch-ovs kernel:  </IRQ>
> Jan 08 09:02:30 arch-ovs kernel:  <TASK>
> Jan 08 09:02:30 arch-ovs kernel:  asm_common_interrupt+0x26/0x40
> Jan 08 09:02:30 arch-ovs kernel: RIP: 
> 0010:finish_task_switch.isra.0+0x9f/0x2e0
> Jan 08 09:02:30 arch-ovs kernel: Code: 34 00 00 00 00 0f 1f 44 00 00 4c 8b bb 
> d8 0c 00 00 4d 85 ff 0f 85 b7 00 00 00 66 90 48 89 df e8 27 e8 db 00 fb 0f 1f 
> 44 00 00 <66> 90 4d 85 ed 74 18 4d 3b ae 40 0a 00 00 0f 84 50 01 00 00 f0 41
> Jan 08 09:02:30 arch-ovs kernel: RSP: 0018:ffffa322800d3e18 EFLAGS: 00000282
> Jan 08 09:02:30 arch-ovs kernel: RAX: 0000000000000001 RBX: ffff96d477cb6c80 
> RCX: 0000000000000002
> Jan 08 09:02:30 arch-ovs kernel: RDX: 0000000000000000 RSI: 0000000000000001 
> RDI: ffff96d477cb6c80
> Jan 08 09:02:30 arch-ovs kernel: RBP: ffffa322800d3e48 R08: 0000000000000001 
> R09: 0000000000000000
> Jan 08 09:02:30 arch-ovs kernel: R10: 0000000000000001 R11: 0000000000000000 
> R12: ffff96d350b04c80
> Jan 08 09:02:30 arch-ovs kernel: R13: 0000000000000000 R14: ffff96d340364c80 
> R15: 0000000000000000
> Jan 08 09:02:30 arch-ovs kernel:  ? finish_task_switch.isra.0+0x99/0x2e0
> Jan 08 09:02:30 arch-ovs kernel:  __schedule+0x3b8/0x12b0
> Jan 08 09:02:30 arch-ovs kernel:  ? pv_native_safe_halt+0xf/0x20
> Jan 08 09:02:30 arch-ovs kernel:  schedule_idle+0x23/0x40
> Jan 08 09:02:30 arch-ovs kernel:  cpu_startup_entry+0x29/0x30
> Jan 08 09:02:30 arch-ovs kernel:  start_secondary+0x11e/0x140
> Jan 08 09:02:30 arch-ovs kernel:  common_startup_64+0x13e/0x141
> Jan 08 09:02:30 arch-ovs kernel:  </TASK>
> 
> [6] https://lore.kernel.org/lkml/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug/
> 
