Waiman,

On Sun, Mar 28 2021 at 20:52, Waiman Long wrote:
> It was found that the following circular locking dependency warning
> could happen in some systems:
>
> [ 218.097878] ======================================================
> [ 218.097879] WARNING: possible circular locking dependency detected
> [ 218.097880] 4.18.0-228.el8.x86_64+debug #1 Not tainted

Reports have to be against latest mainline and not against the random
distro frankenkernel of the day. That's nothing new.

Plus I was asking you to provide a full splat to look at so this can be
discussed _upfront_. Oh well...

> [ 218.097914] -> #2 (&irq_desc_lock_class){-.-.}:
> [ 218.097917]        _raw_spin_lock_irqsave+0x48/0x81
> [ 218.097918]        __irq_get_desc_lock+0xcf/0x140
> [ 218.097919]        __dble_irq_nosync+0x6e/0x110

This function does not even exist in mainline and never existed...

> [ 218.097967]
> [ 218.097967] Chain exists of:
> [ 218.097968]   console_oc_lock_class --> vector_lock
> [ 218.097972]
> [ 218.097973]  Possible unsafe locking scenario:
> [ 218.097973]
> [ 218.097974]        CPU0                    CPU1
> [ 218.097975]        ----                    ----
> [ 218.097975]   lock(vector_lock);
> [ 218.097977]                                lock(&irq_desc_lock_class);
> [ 218.097980]                                lock(vector_lock);
> [ 218.097981]   lock(console_owner);
> [ 218.097983]
> [ 218.097984]  *** DEADLOCK ***
> [ 218.097984]
> [ 218.097985] 6 locks held by systemd/1:
> [ 218.097986]  #0: ffff88822b5cc1e8 (&tty->legacy_mutex){+.+.}, at:
> tty_init_dev+0x79/0x440
> [ 218.097989]  #1: ffff88832ee00770 (&port->mutex){+.+.}, at:
> tty_port_open+0x85/0x190
> [ 218.097993]  #2: ffff88813be85a88 (&desc->request_mutex){+.+.}, at:
> __setup_irq+0x249/0x1e60
> [ 218.097996]  #3: ffff88813be858c0 (&irq_desc_lock_class){-.-.}, at:
> __setup_irq+0x2d9/0x1e60
> [ 218.098000]  #4: ffffffff84afca78 (vector_lock){-.-.}, at:
> x86_vector_activate+0xca/0xab0
> [ 218.098003]  #5: ffffffff84c27e20 (console_lock){+.+.}, at:
> vprintk_emit+0x13a/0x450

This is a more fundamental problem than just vector lock and the same
problem exists with any other printk over serial which is nested in the
interrupt activation chain not only on X86.

> -static int activate_reserved(struct irq_data *irqd)
> +static int activate_reserved(struct irq_data *irqd, char *wbuf, size_t wsize)
> {
...
>         if (!cpumask_subset(irq_data_get_effective_affinity_mask(irqd),
>                             irq_data_get_affinity_mask(irqd))) {
> -               pr_warn("irq %u: Affinity broken due to vector space
> exhaustion.\n",
> -                       irqd->irq);
> +               snprintf(wbuf, wsize, KERN_WARNING
> +                        "irq %u: Affinity broken due to vector space
> exhaustion.\n",
> +                        irqd->irq);

This is not really any more tasteful than the previous one and it does
not fix the fundamental underlying problem.

But, because I'm curious and printk is a constant source of trouble, I
just added unconditional pr_warns into those functions under vector_lock
on 5.12-rc5.

Still waiting for the lockdep splat to show up while enjoying the
trickle of printks over serial.

If you really think this is an upstream problem then please provide a
corresponding lockdep splat on plain 5.12-rc5 along with a .config and
the scenario which triggers this. Not less, not more.

Thanks,

        tglx
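
For readers following the quoted hunk, here is a minimal userspace sketch of the pattern it uses: format the warning into a caller-supplied buffer while the lock is held, and emit it only after the lock is dropped. This is an illustration under stated assumptions, not the kernel code. fake_vector_lock, activate_reserved_sketch() and activate_sketch() are hypothetical names, a pthread mutex stands in for vector_lock, and fputs() stands in for printk. It shows the quoted approach only and does not address the broader problem of printk nested in the interrupt activation chain described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for vector_lock. */
static pthread_mutex_t fake_vector_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical stand-in for the quoted activate_reserved(). It runs with
 * fake_vector_lock held, so it must not print. It only formats the
 * warning into the caller's buffer.
 */
static int activate_reserved_sketch(unsigned int irq, int affinity_broken,
                                    char *wbuf, size_t wsize)
{
	if (affinity_broken)
		snprintf(wbuf, wsize,
			 "irq %u: Affinity broken due to vector space exhaustion.\n",
			 irq);
	return 0;
}

/*
 * Hypothetical caller: takes the lock, collects the message, and emits it
 * only after the lock has been released.
 */
static int activate_sketch(unsigned int irq, int affinity_broken,
			   char *out, size_t outsize)
{
	int ret;

	assert(out != NULL && outsize > 0);
	out[0] = '\0';	/* empty buffer means "nothing to report" */

	pthread_mutex_lock(&fake_vector_lock);
	ret = activate_reserved_sketch(irq, affinity_broken, out, outsize);
	pthread_mutex_unlock(&fake_vector_lock);

	/* Lock dropped: the output path no longer nests under it. */
	if (strlen(out) > 0)
		fputs(out, stderr);

	return ret;
}
```

Calling activate_sketch(42, 1, buf, sizeof(buf)) with a 128-byte buf leaves the formatted warning in buf and writes it to stderr after the mutex is released. With affinity_broken set to 0 the buffer stays empty and nothing is printed.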