On 1/9/25 12:04, Ilya Maximets wrote:
> On 1/9/25 10:06, Friedrich Weber via discuss wrote:
>> Hi,
>>
>> (I first reported this via other channels and was asked to repost here)
>>
>> One of our Proxmox VE users reported that when hitting CTRL+C on a running
>> `ovs-tcpdump` on a host with VM workloads, there is a low (<1/50) chance of a
>> complete OVS network freeze (i.e., the host loses network connection on all
>> OVS-controlled interfaces). Soft lockups are logged, and they need to
>> hard-reset the host.
>>
>> I believe I can reproduce the same issue in an Arch Linux VM (specs [1]) with
>> the following steps:
>>
>> - set up an OVS bridge with an active-backup bond0 and assign an IP [2]
>> - install iperf2 and run a script [3] that spawns an iperf server, then
>>   repeatedly performs the following:
>>   1. start ovs-tcpdump on bond0, which spawns an "inner" tcpdump
>>   2. concurrently, start an ordinary tcpdump on mibond0 (the interface
>>      created by ovs-tcpdump)
>>   3. send SIGINT to the inner tcpdump from step 1. The tcpdump from step 2
>>      exits with "tcpdump: pcap_loop: The interface disappeared".
>> - run a long-running parallel iperf2 against bond0 from outside [4]. I'm
>>   running the iperf from the hypervisor, achieving a cumulative bandwidth
>>   of 55-60 Gbit/s.
>> - after 5-10 min the host usually becomes unreachable via the bond (iperf
>>   reports zero bandwidth), and a soft lockup is logged on the host [5]
>>
>> Note that the user who originally reported this only starts one ovs-tcpdump
>> process -- so there is probably some other unidentified factor that makes the
>> issue more likely to trigger on their host.
>>
>> The symptoms sound similar to the ones described in the kernel patch "net:
>> openvswitch: fix race on port output" [6], but as far as I can tell, this
>> patch is already contained in 6.12.
>>
> 
> Thanks, Friedrich, for the report and for posting it here!
> 
> As previously discussed, it is the same issue that patch [6] attempted to
> fix, but the fix was incomplete, because checking netif_carrier_ok() is
> not enough for the dummy netdev that we use for ovs-tcpdump.
> 
> The dummy network device doesn't implement the ndo_stop callback, so the
> carrier status is not turned off while the device is being stopped.  We
> should also check that the device is running.
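> 
> A minimal sketch of the kind of check I have in mind, assuming the fix
> extends the test that [6] added to do_output() in
> net/openvswitch/actions.c (the actual patch may end up looking
> different):
> 
>     struct vport *vport = ovs_vport_rcu(dp, out_port);
> 
>     /* [6] only checked the carrier, but a dummy netdev has no ndo_stop,
>      * so its carrier stays up while the device is unregistering.  Also
>      * require that the device is still running.  (mru/cutlen handling
>      * from the real function is trimmed here.) */
>     if (likely(vport && netif_running(vport->dev) &&
>                netif_carrier_ok(vport->dev)))
>         ovs_vport_send(vport, skb, ovs_key_mac_proto(key));
>     else
>         kfree_skb(skb);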
> 
> I'll run a few more tests and send out the fix.

Sent: https://lore.kernel.org/netdev/20250109122225.4034688-1-i.maxim...@ovn.org/

> 
> Best regards, Ilya Maximets.
> 
>> Thanks,
>>
>> Friedrich
>>
>> [1]
>>
>> - Hypervisor is Proxmox VE 8.3 (QEMU/KVM)
>> - VM has 4 cores, 8G RAM, 3x virtio-net NICs (one for management, two for
>>   bond0)
>> - VM is running Arch Linux with kernel 6.12.8.arch1-1 (but I can also
>>   reproduce with a build of 6.13-rc6), and openvswitch 3.4.1 (custom-built
>>   package):
>>
>> $ cat /proc/version
>> Linux version 6.12.8-arch1-1 (linux@archlinux) (gcc (GCC) 14.2.1 20240910, GNU ld (GNU Binutils) 2.43.0) #1 SMP PREEMPT_DYNAMIC Thu, 02 Jan 2025 22:52:26 +0000
>> $ ovs-vswitchd --version
>> ovs-vswitchd (Open vSwitch) 3.4.1
>>
>> [2]
>>
>> # inside the VM
>> ovs-vsctl add-br br0
>> ip l set eth1 up
>> ip l set eth2 up
>> ovs-vsctl add-bond br0 bond0 eth1 eth2
>> ip l set br0 up
>> ip addr add 10.2.1.104/16 dev br0
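>>
>> (Note: "ovs-vsctl add-bond" defaults to active-backup mode, which is the
>> mode described above; to set it explicitly, one could add
>> "ovs-vsctl set port bond0 bond_mode=active-backup".)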
>>
>> [3]
>>
>> # inside the VM
>> # inside the VM
>> iperf -s &    # iperf2 server for the traffic generated from outside [4]
>> while true
>> do
>>      # 1. ovs-tcpdump spawns an "inner" tcpdump on the mirror interface mibond0
>>      PYTHONPATH=/usr/share/openvswitch/python/ ovs-tcpdump -env -i bond0 tcp port 12345 &
>>      sleep 2
>>      pid=$(pidof tcpdump)
>>      echo pid: $pid
>>      # 2. ordinary tcpdump on the interface created by ovs-tcpdump
>>      tcpdump -envi mibond0 tcp port 12345 &
>>      sleep 5
>>      echo kill
>>      # 3. SIGINT the inner tcpdump; ovs-tcpdump then removes mibond0
>>      kill -INT $pid
>>      sleep 3
>> done
>>
>> [4]
>>
>> # from outside
>> iperf -c 10.2.1.104 -i1 -t 600 -P128
>>
>> [5]
>>
>> Jan 08 09:01:57 arch-ovs ovs-vswitchd[446]: ovs|00074|bridge|INFO|bridge br0: added interface mibond0 on port 13
>> Jan 08 09:01:57 arch-ovs ovs-vswitchd[446]: 2025-01-08T09:01:57Z|00074|bridge|INFO|bridge br0: added interface mibond0 on port 13
>> Jan 08 09:01:57 arch-ovs kernel: mibond0: entered promiscuous mode
>> Jan 08 09:01:57 arch-ovs kernel: audit: type=1700 audit(1736326917.773:304): dev=mibond0 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
>> Jan 08 09:01:57 arch-ovs kernel: audit: type=1300 audit(1736326917.773:304): arch=c000003e syscall=46 success=yes exit=52 a0=f a1=7ffcbc8bb550 a2=0 a3=55ab1849dd00 items=0 ppid=1 pid=446 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="ovs-vswitchd" exe="/usr/bin/ovs-vswitchd" key=(null)
>> Jan 08 09:01:57 arch-ovs kernel: audit: type=1327 audit(1736326917.773:304): proctitle=2F7573722F7362696E2F6F76732D7673776974636864002D2D70696466696C653D2F72756E2F6F70656E767377697463682F6F76732D76737769746368642E706964
>> Jan 08 09:02:04 arch-ovs systemd-networkd[479]: mibond0: Link DOWN
>> Jan 08 09:02:04 arch-ovs kernel: mibond0 (unregistering): left promiscuous mode
>> Jan 08 09:02:04 arch-ovs kernel: audit: type=1700 audit(1736326924.733:305): dev=mibond0 prom=0 old_prom=256 auid=1000 uid=0 gid=0 ses=3
>> Jan 08 09:02:04 arch-ovs kernel: mibond0 selects TX queue 0, but real number of TX queues is 0
>> Jan 08 09:02:04 arch-ovs audit: ANOM_PROMISCUOUS dev=mibond0 prom=0 old_prom=256 auid=1000 uid=0 gid=0 ses=3
>> Jan 08 09:02:04 arch-ovs systemd-networkd[479]: mibond0: Lost carrier
>> Jan 08 09:02:30 arch-ovs kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [swapper/1:0]
>> Jan 08 09:02:30 arch-ovs kernel: CPU#1 Utilization every 4s during lockup:
>> Jan 08 09:02:30 arch-ovs kernel:         #1:   0% system,        100% softirq,          0% hardirq,          0% idle
>> Jan 08 09:02:30 arch-ovs kernel:         #2:   0% system,        101% softirq,          0% hardirq,          0% idle
>> Jan 08 09:02:30 arch-ovs kernel:         #3:   0% system,        100% softirq,          0% hardirq,          0% idle
>> Jan 08 09:02:30 arch-ovs kernel:         #4:   0% system,        100% softirq,          1% hardirq,          0% idle
>> Jan 08 09:02:30 arch-ovs kernel:         #5:   0% system,        101% softirq,          0% hardirq,          0% idle
>> Jan 08 09:02:30 arch-ovs kernel: Modules linked in: dummy cfg80211 rfkill isofs nfnetlink_cttimeout openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 psample vfat fat sha512_ssse3 sha1_ssse3 aesni_intel gf128mul crypto_simd cryptd psmouse pcspkr i2c_piix4 joydev i2c_smbus mousedev mac_hid loop dm_mod nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock vmw_vmci qemu_fw_cfg ip_tables x_tables btrfs hid_generic blake2b_generic libcrc32c usbhid crc32c_generic xor raid6_pq sr_mod cdrom bochs serio_raw ata_generic drm_vram_helper atkbd pata_acpi drm_ttm_helper libps2 crc32c_intel vivaldi_fmap intel_agp ttm sha256_ssse3 ata_piix virtio_net virtio_balloon net_failover failover virtio_scsi intel_gtt i8042 floppy serio
>> Jan 08 09:02:30 arch-ovs kernel: CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.12.8-arch1-1 #1 099de49ddaebb26408f097c48b36e50b2c8e21c9
>> Jan 08 09:02:30 arch-ovs kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
>> Jan 08 09:02:30 arch-ovs kernel: RIP: 0010:netdev_pick_tx+0x267/0x2b0
>> Jan 08 09:02:30 arch-ovs kernel: Code: c2 48 c1 e8 20 44 01 f0 44 0f b7 f0 e9 c7 fe ff ff e8 ad 7d 5d ff 44 0f b6 04 24 e9 08 fe ff ff 45 89 c8 e9 6e fe ff ff 29 c2 <39> c2 72 89 eb f8 48 85 db 41 bf ff ff ff ff 49 0f 44 dc e9 dd fd
>> Jan 08 09:02:30 arch-ovs kernel: RSP: 0018:ffffa32280120648 EFLAGS: 00000246
>> Jan 08 09:02:30 arch-ovs kernel: RAX: 0000000000000000 RBX: ffff96d343cd2000 RCX: 00000000000005a8
>> Jan 08 09:02:30 arch-ovs kernel: RDX: 0000000000000000 RSI: ffff96d342208900 RDI: ffff96d340364c80
>> Jan 08 09:02:30 arch-ovs kernel: RBP: ffff96d342208900 R08: 0000000000000000 R09: 0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel: R10: ffff96d342208900 R11: ffffa322801208b0 R12: ffff96d343cd2000
>> Jan 08 09:02:30 arch-ovs kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffffffff
>> Jan 08 09:02:30 arch-ovs kernel: FS:  0000000000000000(0000) GS:ffff96d477c80000(0000) knlGS:0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> Jan 08 09:02:30 arch-ovs kernel: CR2: 000071216853a120 CR3: 0000000131ce4000 CR4: 00000000000006f0
>> Jan 08 09:02:30 arch-ovs kernel: Call Trace:
>> Jan 08 09:02:30 arch-ovs kernel:  <IRQ>
>> Jan 08 09:02:30 arch-ovs kernel:  ? watchdog_timer_fn.cold+0x19c/0x219
>> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_watchdog_timer_fn+0x10/0x10
>> Jan 08 09:02:30 arch-ovs kernel:  ? __hrtimer_run_queues+0x132/0x2a0
>> Jan 08 09:02:30 arch-ovs kernel:  ? hrtimer_interrupt+0xfa/0x210
>> Jan 08 09:02:30 arch-ovs kernel:  ? __sysvec_apic_timer_interrupt+0x55/0x100
>> Jan 08 09:02:30 arch-ovs kernel:  ? sysvec_apic_timer_interrupt+0x38/0x90
>> Jan 08 09:02:30 arch-ovs kernel:  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
>> Jan 08 09:02:30 arch-ovs kernel:  ? netdev_pick_tx+0x267/0x2b0
>> Jan 08 09:02:30 arch-ovs kernel:  ? netdev_pick_tx+0x253/0x2b0
>> Jan 08 09:02:30 arch-ovs kernel:  netdev_core_pick_tx+0xa1/0xb0
>> Jan 08 09:02:30 arch-ovs kernel:  __dev_queue_xmit+0x19d/0xe70
>> Jan 08 09:02:30 arch-ovs kernel:  ? kmem_cache_alloc_noprof+0x111/0x2f0
>> Jan 08 09:02:30 arch-ovs kernel:  do_execute_actions+0xce/0x1b70 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ? flow_lookup.isra.0+0x58/0x100 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ovs_execute_actions+0x4c/0x130 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ovs_dp_process_packet+0xa6/0x220 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_netdev_frame_hook+0x10/0x10 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ? __pfx_netdev_frame_hook+0x10/0x10 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  ovs_vport_receive+0x84/0xe0 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  netdev_frame_hook+0xd9/0x1a0 [openvswitch d139b1adcdbcdfb64274f88696adfd125a3e2f3c]
>> Jan 08 09:02:30 arch-ovs kernel:  __netif_receive_skb_core.constprop.0+0x1fa/0x10b0
>> Jan 08 09:02:30 arch-ovs kernel:  __netif_receive_skb_list_core+0x15d/0x300
>> Jan 08 09:02:30 arch-ovs kernel:  netif_receive_skb_list_internal+0x1d4/0x310
>> Jan 08 09:02:30 arch-ovs kernel:  napi_complete_done+0x72/0x220
>> Jan 08 09:02:30 arch-ovs kernel:  virtnet_poll+0x6da/0xe62 [virtio_net ba458d10bdb47849f4b70ed392bbaae27b08be62]
>> Jan 08 09:02:30 arch-ovs kernel:  ? free_unref_page_commit+0x169/0x2e0
>> Jan 08 09:02:30 arch-ovs kernel:  ? enqueue_hrtimer+0x35/0x90
>> Jan 08 09:02:30 arch-ovs kernel:  __napi_poll+0x2b/0x160
>> Jan 08 09:02:30 arch-ovs kernel:  net_rx_action+0x349/0x3e0
>> Jan 08 09:02:30 arch-ovs kernel:  handle_softirqs+0xe4/0x2a0
>> Jan 08 09:02:30 arch-ovs kernel:  __irq_exit_rcu+0x97/0xb0
>> Jan 08 09:02:30 arch-ovs kernel:  common_interrupt+0x85/0xa0
>> Jan 08 09:02:30 arch-ovs kernel:  </IRQ>
>> Jan 08 09:02:30 arch-ovs kernel:  <TASK>
>> Jan 08 09:02:30 arch-ovs kernel:  asm_common_interrupt+0x26/0x40
>> Jan 08 09:02:30 arch-ovs kernel: RIP: 0010:finish_task_switch.isra.0+0x9f/0x2e0
>> Jan 08 09:02:30 arch-ovs kernel: Code: 34 00 00 00 00 0f 1f 44 00 00 4c 8b bb d8 0c 00 00 4d 85 ff 0f 85 b7 00 00 00 66 90 48 89 df e8 27 e8 db 00 fb 0f 1f 44 00 00 <66> 90 4d 85 ed 74 18 4d 3b ae 40 0a 00 00 0f 84 50 01 00 00 f0 41
>> Jan 08 09:02:30 arch-ovs kernel: RSP: 0018:ffffa322800d3e18 EFLAGS: 00000282
>> Jan 08 09:02:30 arch-ovs kernel: RAX: 0000000000000001 RBX: ffff96d477cb6c80 RCX: 0000000000000002
>> Jan 08 09:02:30 arch-ovs kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff96d477cb6c80
>> Jan 08 09:02:30 arch-ovs kernel: RBP: ffffa322800d3e48 R08: 0000000000000001 R09: 0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff96d350b04c80
>> Jan 08 09:02:30 arch-ovs kernel: R13: 0000000000000000 R14: ffff96d340364c80 R15: 0000000000000000
>> Jan 08 09:02:30 arch-ovs kernel:  ? finish_task_switch.isra.0+0x99/0x2e0
>> Jan 08 09:02:30 arch-ovs kernel:  __schedule+0x3b8/0x12b0
>> Jan 08 09:02:30 arch-ovs kernel:  ? pv_native_safe_halt+0xf/0x20
>> Jan 08 09:02:30 arch-ovs kernel:  schedule_idle+0x23/0x40
>> Jan 08 09:02:30 arch-ovs kernel:  cpu_startup_entry+0x29/0x30
>> Jan 08 09:02:30 arch-ovs kernel:  start_secondary+0x11e/0x140
>> Jan 08 09:02:30 arch-ovs kernel:  common_startup_64+0x13e/0x141
>> Jan 08 09:02:30 arch-ovs kernel:  </TASK>
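>>
>> If I read the disassembly right, the RIP in netdev_pick_tx() sits in the
>> queue-index reduction loop of skb_tx_hash() (net/core/dev.c, presumably
>> inlined here), which can no longer terminate once real_num_tx_queues has
>> dropped to 0 -- matching the "selects TX queue 0, but real number of TX
>> queues is 0" warning above. Paraphrasing the 6.12 code:
>>
>>     /* skb_tx_hash(): reduce the recorded RX queue number into the TX
>>      * queue range.  With qcount == dev->real_num_tx_queues == 0 on the
>>      * unregistering mibond0, hash >= 0 is always true and hash -= 0
>>      * makes no progress, so the softirq spins forever. */
>>     while (unlikely(hash >= qcount))
>>         hash -= qcount;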
>>
>> [6] https://lore.kernel.org/lkml/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug/
>>
> 
