On Thu, Dec 06, 2018 at 05:11:41PM -0800, Bart Van Assche wrote:

> +	if (WARN_ON_ONCE(!hlock_class(prev)->hash_entry.pprev) ||
> +	    WARN_ONCE(!hlock_class(next)->hash_entry.pprev,
> +		      KERN_INFO "Detected use-after-free of lock class %s\n",
> +		      hlock_class(next)->name)) {
> +		return 2;
> +	}
Ah, this is that UaF on ->name, but it only happens when there's already
been a UaF, so that's fine I suppose. Still, a note on that in the earlier
Changelog would've been nice.

> +/* Must be called with the graph lock held. */
> +static void remove_class_from_lock_chain(struct lock_chain *chain,
> +					 struct lock_class *class)
> +{
> +	u64 chain_key;
> +	int i;
> +
> +#ifdef CONFIG_PROVE_LOCKING
> +	for (i = chain->base; i < chain->base + chain->depth; i++) {
> +		if (chain_hlocks[i] != class - lock_classes)
> +			continue;
> +		if (--chain->depth > 0) {
> +			memmove(&chain_hlocks[i], &chain_hlocks[i + 1],
> +				(chain->base + chain->depth - i) *
> +				sizeof(chain_hlocks[0]));
> +		}

Also, a comment here noting that we 'leak' chain_hlocks[] entries would be
appropriate. If Waiman cares, it is possible to reclaim them by extending
the above memmove() to cover the _entire_ tail of the array and then going
around and fixing up all the chain->base 'pointers' that are in the moved
part.

> +		/*
> +		 * Each lock class occurs at most once in a
> +		 * lock chain so once we found a match we can
> +		 * break out of this loop.
> +		 */
> +		goto recalc;
> +	}
> +	/* Since the chain has not been modified, return. */
> +	return;
> +
> +recalc:
> +	/*
> +	 * Note: calling hlist_del_rcu() from inside a
> +	 * hlist_for_each_entry_rcu() loop is safe.
> +	 */
> +	if (chain->depth == 0) {
> +		/* To do: decrease chain count. See also inc_chains(). */
> +		hlist_del_rcu(&chain->entry);
> +		return;
> +	}
> +	chain_key = 0;
> +	for (i = chain->base; i < chain->base + chain->depth; i++)
> +		chain_key = iterate_chain_key(chain_key, chain_hlocks[i] + 1);
> +	if (chain->chain_key == chain_key)
> +		return;
> +	hlist_del_rcu(&chain->entry);
> +	chain->chain_key = chain_key;
> +	hlist_add_head_rcu(&chain->entry, chainhashentry(chain_key));

I think this is broken; while hlist_del_rcu() and hlist_add_head_rcu() are
individually safe against concurrent hlist_for_each_entry_rcu(), they
cannot be used consecutively like this.

The thing is, hlist_del_rcu() preserves the ->next pointer, such that a
concurrent hlist_for_each_entry_rcu() that happens to be at that entry
will be able to continue.

But your hlist_add_head_rcu() will overwrite that, so the concurrent
iterator, which started on one list, will continue on another.

There must be an RCU-GP between the del_rcu and the add_rcu such that the
above situation cannot happen.

> +/*
> + * Free all lock classes that are on the zapped_classes list. Called as an
> + * RCU callback function.
> + */
> +static void free_zapped_classes(struct callback_head *ch)
> +{
> +	struct lock_class *class;
> +	unsigned long flags;
> +	int locked;
> +
> +	raw_local_irq_save(flags);
> +	locked = graph_lock();

If this fails, you must not touch any of the state. You don't hold the
lock, after all.

> +	rcu_callback_scheduled = false;
> +	list_for_each_entry(class, &zapped_classes, lock_entry) {
> +		reinit_class(class);
> +		nr_lock_classes--;
> +	}
> +	list_splice_init(&zapped_classes, &free_lock_classes);
> +	if (locked)
> +		graph_unlock();
> +	raw_local_irq_restore(flags);
> +}
> +
> +/* Must be called with the graph lock held. */
> +static void schedule_free_zapped_classes(void)
> +{
> +	if (rcu_callback_scheduled)
> +		return;
> +	rcu_callback_scheduled = true;
> +	call_rcu(&free_zapped_classes_rcu_head, free_zapped_classes);
> +}

But I fear this is fundamentally broken too, in exactly the manner I
tried to explain earlier.
(sorry, wide(r) terminal required)

It starts out good:

	CPU0			CPU1			CPU2

	rcu_read_lock()

	// use class
				zap_class()
				schedule_free_zapped_classes();
				  call_rcu();

							rcu_read_lock()

		/* since none of the rcu_read_lock()
		 * sections we started out with are
		 * still active; this is where the
		 * callback _can_ happen */

But then, another user and removal come in:

	rcu_read_lock();

	// use class
				zap_class()
				  list_add(&entry, &zapped_classes)
				schedule_free_zapped_classes()
				  /* no-op, still pending */

				free_zapped_classes()
				  list_splice(&zapped_classes,
					      &free_lock_classes)

		*whoops* we just splice'd a class onto the
		free list that is still in use !!

	rcu_read_unlock()

There are two possible solutions:

 - the second schedule_free_zapped_classes() must somehow ensure the
   call_rcu() callback re-queues itself and waits for yet another GP, or

 - you must add classes zapped after the first invocation to a second
   list.
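Something along the lines of the below, entirely untested, is what I have
in mind for that first option; the zapped_while_pending flag is invented
here, everything else uses the names from your patch:

static bool rcu_callback_scheduled;
static bool zapped_while_pending;

/* RCU callback; see schedule_free_zapped_classes() below. */
static void free_zapped_classes(struct callback_head *ch)
{
	struct lock_class *class;
	unsigned long flags;

	raw_local_irq_save(flags);
	if (!graph_lock())
		/* lockdep got disabled; don't touch any state */
		goto out;

	if (zapped_while_pending) {
		/*
		 * Classes were zapped after the previous call_rcu();
		 * readers may still hold references to them, so wait
		 * for yet another grace period before freeing anything.
		 */
		zapped_while_pending = false;
		call_rcu(&free_zapped_classes_rcu_head,
			 free_zapped_classes);
		goto unlock;
	}

	rcu_callback_scheduled = false;
	list_for_each_entry(class, &zapped_classes, lock_entry) {
		reinit_class(class);
		nr_lock_classes--;
	}
	list_splice_init(&zapped_classes, &free_lock_classes);
unlock:
	graph_unlock();
out:
	raw_local_irq_restore(flags);
}

/* Must be called with the graph lock held. */
static void schedule_free_zapped_classes(void)
{
	if (rcu_callback_scheduled) {
		/*
		 * A grace period is already pending; whatever got zapped
		 * now must wait for one more.
		 */
		zapped_while_pending = true;
		return;
	}
	rcu_callback_scheduled = true;
	call_rcu(&free_zapped_classes_rcu_head, free_zapped_classes);
}

The second option works out much the same, except you would flip between
two zapped lists instead of using that flag.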