On 1/9/25 12:04, Ilya Maximets wrote:
> On 1/9/25 10:06, Friedrich Weber via discuss wrote:
>> Hi,
>>
>> (first reported this via other channels and was asked to repost here)
>>
>> One of our Proxmox VE users reported that when hitting CTRL+C on a running
>> `ovs-tcpdump` on a host with VM workloads, there is a low (<1/50) chance of
>> a complete OVS network freeze (i.e., the host loses network connectivity on
>> all OVS-controlled interfaces). Soft lockups are logged, and they need to
>> hard-reset the host.
>>
>> I believe I can reproduce the same issue in an Arch Linux VM (specs [1])
>> with the following steps:
>>
>> - set up an OVS bridge with an active-backup bond0 and assign an IP [2]
>> - install iperf2 and run a script [3] that spawns an iperf server, then
>>   repeatedly performs the following:
>>   1. start ovs-tcpdump on bond0, which spawns an "inner" tcpdump
>>   2. concurrently, start an ordinary tcpdump on mibond0 (the interface
>>      created by ovs-tcpdump)
>>   3. send SIGINT to the inner tcpdump from step 1. The tcpdump from step 2
>>      exits with "tcpdump: pcap_loop: The interface disappeared".
>> - run a long-running parallel iperf2 against bond0 from outside [4]. I'm
>>   running the iperf from the hypervisor, achieving a cumulative bandwidth
>>   of 55-60 Gbit/s.
>> - after 5-10 minutes the host usually becomes unreachable via the bond
>>   (iperf reports zero bandwidth), and a soft lockup is logged on the host [5]
>>
>> Note that the user who originally reported this only starts one ovs-tcpdump
>> process -- so there is probably some other unidentified factor that makes
>> the issue more likely to trigger on their host.
>>
>> The symptoms sound similar to the ones described in the kernel patch "net:
>> openvswitch: fix race on port output" [6], but as far as I can tell, this
>> patch is already contained in 6.12.
>>
>
> Thanks, Friedrich, for the report and for posting it here!
>
> As previously discussed, it is the same issue that the patch [6] attempted
> to fix, but the fix was incomplete: checking netif_carrier_ok() is not
> enough for the dummy netdev that we use for ovs-tcpdump.
>
> The dummy network device doesn't implement the ndo_stop callback, so the
> carrier status is not turned off while the device is being stopped. We
> should also check that the device is running.
>
> I'll run a few more tests and send out the fix.
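For context, the idea is roughly the following. This is only a minimal
sketch, under the assumption that the extra check sits next to the existing
carrier check in do_output() in net/openvswitch/actions.c; see the actual
patch linked below for what was really done:

/* Sketch, not the actual patch: the carrier check added by [6] is not
 * sufficient on its own.  A dummy netdev has no ndo_stop callback, so
 * its carrier stays up even while the device is being unregistered,
 * and netdev_pick_tx() then spins on a device whose real number of TX
 * queues has already dropped to zero.  Requiring the device to also be
 * running closes that window. */
static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
                      struct sw_flow_key *key)
{
        struct vport *vport = ovs_vport_rcu(dp, out_port);

        if (likely(vport && netif_carrier_ok(vport->dev) &&
                   netif_running(vport->dev))) {
                /* ... existing MRU/cutlen handling elided ... */
                ovs_vport_send(vport, skb, ovs_key_mac_proto(key));
        } else {
                kfree_skb(skb);
        }
}

Unlike the carrier bit, netif_running() reflects __LINK_STATE_START, which
the core clears during dev_close()/unregistration regardless of whether the
driver implements ndo_stop, so it also covers the dummy device case.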
Sent: https://lore.kernel.org/netdev/20250109122225.4034688-1-i.maxim...@ovn.org/

> Best regards, Ilya Maximets.
>
>> Thanks,
>>
>> Friedrich
>>
>> [1]
>>
>> - Hypervisor is Proxmox VE 8.3 (QEMU/KVM)
>> - VM has 4 cores, 8G RAM, 3x virtio-net NICs (one for management, two for
>>   bond0)
>> - VM is running Arch Linux with kernel 6.12.8-arch1-1 (but I can also
>>   reproduce with a build of 6.13-rc6), and openvswitch 3.4.1 (custom-built
>>   package):
>>
>> $ cat /proc/version
>> Linux version 6.12.8-arch1-1 (linux@archlinux) (gcc (GCC) 14.2.1 20240910, GNU ld (GNU Binutils) 2.43.0) #1 SMP PREEMPT_DYNAMIC Thu, 02 Jan 2025 22:52:26 +0000
>> $ ovs-vswitchd --version
>> ovs-vswitchd (Open vSwitch) 3.4.1
>>
>> [2]
>>
>> # inside the VM
>> ovs-vsctl add-br br0
>> ip l set eth1 up
>> ip l set eth2 up
>> ovs-vsctl add-bond br0 bond0 eth1 eth2
>> ip l set br0 up
>> ip addr add 10.2.1.104/16 dev br0
>>
>> [3]
>>
>> # inside the VM
>> iperf -s &
>> while true; do
>>     PYTHONPATH=/usr/share/openvswitch/python/ ovs-tcpdump -env -i bond0 tcp port 12345 &
>>     sleep 2
>>     pid=$(pidof tcpdump)
>>     echo pid: $pid
>>     tcpdump -envi mibond0 tcp port 12345 &
>>     sleep 5
>>     echo kill
>>     kill -INT $pid
>>     sleep 3
>> done
>>
>> [4]
>>
>> # from outside
>> iperf -c 10.2.1.104 -i1 -t 600 -P128
>>
>> [5]
>>
>> Jan 08 09:01:57 arch-ovs ovs-vswitchd[446]: ovs|00074|bridge|INFO|bridge br0: added interface mibond0 on port 13
>> Jan 08 09:01:57 arch-ovs ovs-vswitchd[446]: 2025-01-08T09:01:57Z|00074|bridge|INFO|bridge br0: added interface mibond0 on port 13
>> Jan 08 09:01:57 arch-ovs kernel: mibond0: entered promiscuous mode
>> Jan 08 09:01:57 arch-ovs kernel: audit: type=1700 audit(1736326917.773:304): dev=mibond0 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
>> Jan 08 09:01:57 arch-ovs kernel: audit: type=1300 audit(1736326917.773:304): arch=c000003e syscall=46 success=yes exit=52 a0=f a1=7ffcbc8bb550 a2=0 a3=55ab1849dd00 items=0 ppid=1 pid=446 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="ovs-vswitchd" exe="/usr/bin/ovs-vswitchd" key=(null)
>> Jan 08 09:01:57 arch-ovs kernel: audit: type=1327 audit(1736326917.773:304): proctitle=2F7573722F7362696E2F6F76732D7673776974636864002D2D70696466696C653D2F72756E2F6F70656E767377697463682F6F76732D76737769746368642E706964
>> Jan 08 09:02:04 arch-ovs systemd-networkd[479]: mibond0: Link DOWN
>> Jan 08 09:02:04 arch-ovs kernel: mibond0 (unregistering): left promiscuous mode
>> Jan 08 09:02:04 arch-ovs kernel: audit: type=1700 audit(1736326924.733:305): dev=mibond0 prom=0 old_prom=256 auid=1000 uid=0 gid=0 ses=3
>> Jan 08 09:02:04 arch-ovs kernel: mibond0 selects TX queue 0, but real number of TX queues is 0
>> Jan 08 09:02:04 arch-ovs audit: ANOM_PROMISCUOUS dev=mibond0 prom=0 old_prom=256 auid=1000 uid=0 gid=0 ses=3
>> Jan 08 09:02:04 arch-ovs systemd-networkd[479]: mibond0: Lost carrier
>> Jan 08 09:02:30 arch-ovs kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [swapper/1:0]
>> Jan 08 09:02:30 arch-ovs kernel: CPU#1 Utilization every 4s during lockup:
>> Jan 08 09:02:30 arch-ovs kernel:   #1:   0% system,  100% softirq,   0% hardirq,   0% idle
>> Jan 08 09:02:30 arch-ovs kernel:   #2:   0% system,  101% softirq,   0% hardirq,   0% idle
>> Jan 08 09:02:30 arch-ovs kernel:   #3:   0% system,  100% softirq,   0% hardirq,   0% idle
>> Jan 08 09:02:30 arch-ovs kernel:   #4:   0% system,  100% softirq,   1% hardirq,   0% idle
>> Jan 08 09:02:30 arch-ovs kernel:   #5:   0% system,  101% softirq,   0% hardirq,   0% idle
>> Jan 08 09:02:30 arch-ovs kernel: Modules linked in: dummy cfg80211 rfkill isofs nfnetlink_cttimeout openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 psample vfat fat sha512_ssse3 sha1_ssse3 aesni_intel gf128mul crypto_simd cryptd psmouse pcspkr i2c_piix4 joydev i2c_smbus mousedev mac_hid loop dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs hid_generic blake2b_generic libcrc32c usbhid crc32c_generic xor raid6_pq sr_mod cdrom bochs serio_raw ata_generic drm_vram_helper atkbd pata_acpi drm_ttm_helper libps2 crc32c_intel vivaldi_fmap intel_agp ttm sha256_ssse3 ata_piix virtio_net virtio_balloon net_failover failover virtio_scsi intel_gtt i8042 floppy serio
>> Jan 08 09:02:30 arch-ovs kernel: CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.8-arch1-1 #1 099de49ddaebb26408f097c48b36e50b2c8e21c9
>> Jan 08 09:02:30 arch-ovs kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
>> Jan 08 09:02:30 arch-ovs kernel: RIP: 0010:netdev_pick_tx+0x267/0x2b0
>> Jan 08 09:02:30 arch-ovs kernel: Code: c2 48 c1 e8 20 44 01 f0 44 0f b7 f0 e9 c7 fe ff ff e8 ad 7d 5d ff 44 0f b6 04 24 e9 08 fe ff ff 45 89 c8 e9 6e fe ff ff 29 c2 <39> c2 72 89 eb f8 48 85 db 41 bf ff ff ff ff 49 0f 44 dc e9 dd fd
>> Jan 08 09:02:30 arch-ovs kernel: RSP: 0018:ffffa32280120648 EFLAGS: 00000246
>> Jan 08 09:02:30 arch-ovs kernel: RAX: 0000000000000000 RBX: ffff96d343cd2000 RCX: 00000000000005a8
>> Jan 08 09:02:30 arch-ovs kernel: RDX: 0000000000000000 RSI: ffff96d342208900 RDI: ffff96d340364c80
>> Jan 08 09:02:30 arch-ovs kernel: RBP: ffff96d342208900 R08: 0000000000000000 R09: 0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel: R10: ffff96d342208900 R11: ffffa322801208b0 R12: ffff96d343cd2000
>> Jan 08 09:02:30 arch-ovs kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffffffff
>> Jan 08 09:02:30 arch-ovs kernel: FS:  0000000000000000(0000) GS:ffff96d477c80000(0000) knlGS:0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> Jan 08 09:02:30 arch-ovs kernel: CR2: 000071216853a120 CR3: 0000000131ce4000 CR4: 00000000000006f0
>> Jan 08 09:02:30 arch-ovs kernel: Call Trace:
>> Jan 08 09:02:30 arch-ovs kernel:  <IRQ>
>> Jan 08 09:02:30 arch-ovs kernel:  ? watchdog_timer_fn.cold+0x19c/0x219
>> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
>> Jan 08 09:02:30 arch-ovs kernel:  ? __hrtimer_run_queues+0x132/0x2a0
>> Jan 08 09:02:30 arch-ovs kernel:  ? hrtimer_interrupt+0xfa/0x210
>> Jan 08 09:02:30 arch-ovs kernel:  ? __sysvec_apic_timer_interrupt+0x55/0x100
>> Jan 08 09:02:30 arch-ovs kernel:  ? sysvec_apic_timer_interrupt+0x38/0x90
>> Jan 08 09:02:30 arch-ovs kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
>> Jan 08 09:02:30 arch-ovs kernel:  ? netdev_pick_tx+0x267/0x2b0
>> Jan 08 09:02:30 arch-ovs kernel:  ? netdev_pick_tx+0x253/0x2b0
>> Jan 08 09:02:30 arch-ovs kernel:  netdev_core_pick_tx+0xa1/0xb0
>> Jan 08 09:02:30 arch-ovs kernel:  __dev_queue_xmit+0x19d/0xe70
>> Jan 08 09:02:30 arch-ovs kernel:  ? kmem_cache_alloc_noprof+0x111/0x2f0
>> Jan 08 09:02:30 arch-ovs kernel:  do_execute_actions+0xce/0x1b70 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ? flow_lookup.isra.0+0x58/0x100 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ovs_execute_actions+0x4c/0x130 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ovs_dp_process_packet+0xa6/0x220 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_netdev_frame_hook+0x10/0x10 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_netdev_frame_hook+0x10/0x10 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ovs_vport_receive+0x84/0xe0 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  netdev_frame_hook+0xd9/0x1a0 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  __netif_receive_skb_core.constprop.0+0x1fa/0x10b0
>> Jan 08 09:02:30 arch-ovs kernel:  __netif_receive_skb_list_core+0x15d/0x300
>> Jan 08 09:02:30 arch-ovs kernel:  netif_receive_skb_list_internal+0x1d4/0x310
>> Jan 08 09:02:30 arch-ovs kernel:  napi_complete_done+0x72/0x220
>> Jan 08 09:02:30 arch-ovs kernel:  virtnet_poll+0x6da/0xe62 [virtio_net ba458d10bdb47849f4b70ed392bbaae27b08be62]
>> Jan 08 09:02:30 arch-ovs kernel:  ? free_unref_page_commit+0x169/0x2e0
>> Jan 08 09:02:30 arch-ovs kernel:  ? enqueue_hrtimer+0x35/0x90
>> Jan 08 09:02:30 arch-ovs kernel:  __napi_poll+0x2b/0x160
>> Jan 08 09:02:30 arch-ovs kernel:  net_rx_action+0x349/0x3e0
>> Jan 08 09:02:30 arch-ovs kernel:  handle_softirqs+0xe4/0x2a0
>> Jan 08 09:02:30 arch-ovs kernel:  __irq_exit_rcu+0x97/0xb0
>> Jan 08 09:02:30 arch-ovs kernel:  common_interrupt+0x85/0xa0
>> Jan 08 09:02:30 arch-ovs kernel:  </IRQ>
>> Jan 08 09:02:30 arch-ovs kernel:  <TASK>
>> Jan 08 09:02:30 arch-ovs kernel:  asm_common_interrupt+0x26/0x40
>> Jan 08 09:02:30 arch-ovs kernel: RIP: 0010:finish_task_switch.isra.0+0x9f/0x2e0
>> Jan 08 09:02:30 arch-ovs kernel: Code: 34 00 00 00 00 0f 1f 44 00 00 4c 8b bb d8 0c 00 00 4d 85 ff 0f 85 b7 00 00 00 66 90 48 89 df e8 27 e8 db 00 fb 0f 1f 44 00 00 <66> 90 4d 85 ed 74 18 4d 3b ae 40 0a 00 00 0f 84 50 01 00 00 f0 41
>> Jan 08 09:02:30 arch-ovs kernel: RSP: 0018:ffffa322800d3e18 EFLAGS: 00000282
>> Jan 08 09:02:30 arch-ovs kernel: RAX: 0000000000000001 RBX: ffff96d477cb6c80 RCX: 0000000000000002
>> Jan 08 09:02:30 arch-ovs kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff96d477cb6c80
>> Jan 08 09:02:30 arch-ovs kernel: RBP: ffffa322800d3e48 R08: 0000000000000001 R09: 0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff96d350b04c80
>> Jan 08 09:02:30 arch-ovs kernel: R13: 0000000000000000 R14: ffff96d340364c80 R15: 0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel:  ? finish_task_switch.isra.0+0x99/0x2e0
>> Jan 08 09:02:30 arch-ovs kernel:  __schedule+0x3b8/0x12b0
>> Jan 08 09:02:30 arch-ovs kernel:  ? pv_native_safe_halt+0xf/0x20
>> Jan 08 09:02:30 arch-ovs kernel:  schedule_idle+0x23/0x40
>> Jan 08 09:02:30 arch-ovs kernel:  cpu_startup_entry+0x29/0x30
>> Jan 08 09:02:30 arch-ovs kernel:  start_secondary+0x11e/0x140
>> Jan 08 09:02:30 arch-ovs kernel:  common_startup_64+0x13e/0x141
>> Jan 08 09:02:30 arch-ovs kernel:  </TASK>
>>
>> [6] https://lore.kernel.org/lkml/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug/

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss