On 03/07/2024 17:26, Johannes Berg wrote:
Lockdep reports a spinlock ordering problem: sometimes we take head_lock
first, sometimes tail_lock, so there's a classic ABBA deadlock possible.
It may not happen now because of UML being single-CPU and all that, but
perhaps with preempt?

You need both locks only when updating the queue depth. That was not an issue without preemption; looking at the trace, it is an issue now, so it will need to be sorted out.

The way the locks work is: head_lock guards head, tail_lock guards tail, and head/tail run around the ring and wrap.

If the queue_depth increment and decrement use atomics, you do not need to grab both locks. One lock plus an atomic add/sub will suffice.
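
Roughly along these lines (an untested sketch of the idea only, not the actual vector_kern.c code; struct ring_queue and the helper names are made-up stand-ins for illustration):

#include <linux/atomic.h>
#include <linux/errno.h>
#include <linux/spinlock.h>

/* Simplified stand-in for the real queue structure. */
struct ring_queue {
	spinlock_t head_lock;	/* guards head (consumer side) */
	spinlock_t tail_lock;	/* guards tail (producer side) */
	int head, tail, max_depth;
	atomic_t queue_depth;	/* was an int guarded by both locks */
};

/* Producer path: takes tail_lock only, never nests head_lock. */
static int ring_enqueue(struct ring_queue *q)
{
	spin_lock(&q->tail_lock);
	/*
	 * Safe to test under tail_lock alone: concurrent dequeues can
	 * only shrink queue_depth, so a "not full" answer stays valid.
	 */
	if (atomic_read(&q->queue_depth) >= q->max_depth) {
		spin_unlock(&q->tail_lock);
		return -ENOSPC;
	}
	q->tail = (q->tail + 1) % q->max_depth;	/* advance and wrap */
	atomic_inc(&q->queue_depth);
	spin_unlock(&q->tail_lock);
	return 0;
}

/* Consumer path: takes head_lock only. */
static int ring_dequeue(struct ring_queue *q)
{
	spin_lock(&q->head_lock);
	/* Concurrent enqueues can only grow queue_depth. */
	if (atomic_read(&q->queue_depth) == 0) {
		spin_unlock(&q->head_lock);
		return -ENODATA;
	}
	q->head = (q->head + 1) % q->max_depth;	/* advance and wrap */
	atomic_dec(&q->queue_depth);
	spin_unlock(&q->head_lock);
	return 0;
}

With that, neither path ever holds head_lock and tail_lock at the same time, so the ABBA chain lockdep complains about cannot form.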


Report:

======================================================
WARNING: possible circular locking dependency detected
6.10.0-rc6-00036-gcd01672d64a3 #26 Not tainted
------------------------------------------------------
systemd-network/105 is trying to acquire lock:
00000000622bd8b0 (&result->tail_lock){+...}-{2:2}, at: vector_send+0x1c5/0x266

but task is already holding lock:
00000000622bd870 (&result->head_lock){+...}-{2:2}, at: vector_send+0x3b/0x266

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&result->head_lock){+...}-{2:2}:
        save_stack_trace+0x32/0x34
        stack_trace_save+0x38/0x3d
        save_trace+0x90/0x268
        __lock_acquire+0x146c/0x15a4
        lock_acquire+0x2e1/0x38d
        _raw_spin_lock+0x43/0x5a
        vector_net_start_xmit+0x1ef/0x3bd
        netdev_start_xmit+0x51/0x80
        dev_hard_start_xmit+0x137/0x22f
        sch_direct_xmit+0xc6/0x2a3
        __dev_queue_xmit+0x3da/0x877
        packet_xmit+0x3c/0xf2
        packet_sendmsg+0x106a/0x10d0
        sock_sendmsg_nosec+0x1c/0x98
        __sys_sendto+0x113/0x147
        sys_sendto+0x14/0x18
        handle_syscall+0xa8/0xd8
        userspace+0x323/0x4e5
        fork_handler+0x8c/0x8e

-> #0 (&result->tail_lock){+...}-{2:2}:
        save_stack_trace+0x32/0x34
        stack_trace_save+0x38/0x3d
        save_trace+0x90/0x268
        print_circular_bug+0x68/0x2b5
        check_noncircular+0x11c/0x12c
        __lock_acquire+0x12a4/0x15a4
        lock_acquire+0x2e1/0x38d
        _raw_spin_lock+0x43/0x5a
        vector_send+0x1c5/0x266
        vector_net_start_xmit+0x394/0x3bd
        netdev_start_xmit+0x51/0x80
        dev_hard_start_xmit+0x137/0x22f
        sch_direct_xmit+0xc6/0x2a3
        __dev_queue_xmit+0x3da/0x877
        packet_xmit+0x3c/0xf2
        packet_sendmsg+0x106a/0x10d0
        sock_sendmsg_nosec+0x1c/0x98
        __sys_sendto+0x113/0x147
        sys_sendto+0x14/0x18
        handle_syscall+0xa8/0xd8
        userspace+0x323/0x4e5
        fork_handler+0x8c/0x8e

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&result->head_lock);
                                lock(&result->tail_lock);
                                lock(&result->head_lock);
   lock(&result->tail_lock);

  *** DEADLOCK ***

4 locks held by systemd-network/105:
  #0: 0000000060749920 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire+0x16/0x30
  #1: 0000000066a47210 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin+0x2e/0x66
  #2: 000000006227f690 (_xmit_ETHER#2){+...}-{2:2}, at: sch_direct_xmit+0x9b/0x2a3
  #3: 00000000622bd870 (&result->head_lock){+...}-{2:2}, at: vector_send+0x3b/0x266

stack backtrace:
CPU: 0 PID: 105 Comm: systemd-network Not tainted 6.10.0-rc6-00036-gcd01672d64a3 #26
Stack:
  812c7710 604c0e3b 000000ea 00000000
  ffffff00 6064f6f1 604de3a0 6063beda
  812c7740 604e8b7c 000000ec 604df56f
Call Trace:
  [<604df56f>] ? _printk+0x0/0x98
  [<6002b386>] show_stack+0x141/0x150
  [<604c0e3b>] ? dump_stack_print_info+0xe2/0xf1
  [<604de3a0>] ? __print_lock_name+0x0/0xbd
  [<604e8b7c>] dump_stack_lvl+0x71/0xb8
  [<604df56f>] ? _printk+0x0/0x98
  [<604e8be1>] dump_stack+0x1e/0x20
  [<6009348a>] print_circular_bug+0x2a6/0x2b5
  [<600c8e7d>] ? stack_trace_save+0x38/0x3d
  [<60090fbf>] ? save_trace+0x90/0x268
  [<600935b5>] check_noncircular+0x11c/0x12c
  [<600acb42>] ? rcu_read_lock_any_held+0x40/0x9c
  [<604ea4ec>] ? __this_cpu_preempt_check+0x14/0x16
  [<60091cc6>] ? lookup_chain_cache+0x4e/0x130
  [<60090d22>] ? hlock_class+0x0/0xa3
  [<600963be>] __lock_acquire+0x12a4/0x15a4
  [<604ea4d8>] ? __this_cpu_preempt_check+0x0/0x16
  [<60095034>] lock_acquire+0x2e1/0x38d
  [<60034e49>] ? vector_send+0x1c5/0x266
  [<604e9cf7>] ? debug_lockdep_rcu_enabled+0x0/0x3f
  [<604e9cf7>] ? debug_lockdep_rcu_enabled+0x0/0x3f
  [<604f2532>] _raw_spin_lock+0x43/0x5a
  [<60034e49>] ? vector_send+0x1c5/0x266
  [<604f24ef>] ? _raw_spin_lock+0x0/0x5a
  [<60034e49>] vector_send+0x1c5/0x266
  [<604f28bb>] ? _raw_spin_unlock+0x0/0x7c
  [<6003538c>] vector_net_start_xmit+0x394/0x3bd
  [<604ea4d8>] ? __this_cpu_preempt_check+0x0/0x16
  [<603d0711>] netdev_start_xmit+0x51/0x80
  [<600b09cd>] ? rcu_is_watching+0x0/0x6f
  [<603d79f9>] dev_hard_start_xmit+0x137/0x22f
  [<6041a35b>] sch_direct_xmit+0xc6/0x2a3
  [<603d15a7>] ? qdisc_run_end+0x0/0x52
  [<603d7f6b>] __dev_queue_xmit+0x3da/0x877
  [<603b3f55>] ? skb_set_owner_w+0x96/0x9b
  [<604ba1b5>] packet_xmit+0x3c/0xf2
  [<604bfbc2>] packet_sendmsg+0x106a/0x10d0
  [<6002e30a>] ? copy_chunk_from_user+0x26/0x30
  [<6002e2e4>] ? copy_chunk_from_user+0x0/0x30
  [<6002e592>] ? do_op_one_page+0x9d/0xa9
  [<60150008>] ? __gup_longterm_locked+0x39b/0x618
  [<603af0f0>] sock_sendmsg_nosec+0x1c/0x98
  [<603b13ca>] __sys_sendto+0x113/0x147
  [<600f8264>] ? __seccomp_filter+0xa9/0x3c7
  [<6004682f>] ? switch_threads+0x28/0x57
  [<603b1412>] sys_sendto+0x14/0x18
  [<6002e252>] handle_syscall+0xa8/0xd8
  [<60046631>] userspace+0x323/0x4e5
  [<6002a494>] ? interrupt_end+0x0/0xe4
  [<6002a3c3>] fork_handler+0x8c/0x8e

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

