On 29/08/17 21:09, Eric Dumazet wrote:
> On Tue, 2017-08-29 at 20:58 +0300, Nikolay Aleksandrov wrote:
>> The commit below added a call to the ->destroy() callback for all qdiscs
>> which failed in their ->init(), but some were not prepared for such
>> change and can't handle partially initialized qdisc. HTB is one of them
>> and if any error occurs before the qdisc watchdog timer and qdisc work are
>> initialized then we can hit either a null ptr deref (timer->base) when
>> canceling in ->destroy or lockdep error info about trying to register
>> a non-static key and a stack dump. So to fix these two move the watchdog
>> timer and workqueue init before anything that can err out.
>> To reproduce userspace needs to send broken htb qdisc create request,
>> tested with a modified tc (q_htb.c).
>>
>> Note that probably this bug goes further back because the default qdisc
>> handling always calls ->destroy on init failure too.
>>
>> Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
>> Signed-off-by: Nikolay Aleksandrov <niko...@cumulusnetworks.com>
>> ---
>> Always calling qdisc destroy on init failure in the default qdisc handling
>> was added in commit 0fbbeb1ba43b. I'm not sure if I should include that
>> one as fixes tag.
>
> Well, we probably need to audit init/destroy not only in net/sched, but
> other parts of networking stack.
>
I'm not sure I follow. I hit this while working on a net/sched/ patch where I
had to error out in the init() function.

> What about the qdisc_skb_head_init(&q->direct_queue) call ?
>
> I am surprised you do not crash in __skb_queue_purge(&q->direct_queue);
>

Hm, do you mean in __qdisc_reset_queue() ? I have only tried/seen the crash
happen on qdisc add. A much simpler bug to trigger is in sch_multiq (with ethX
being a non-multiqueue device):
$ tc qdisc add dev ethX root multiq
(error EOPNOTSUPP + double free of ->queues due to the free in init())
followed by e.g.
$ ip l add dumdum type dummy
to see a crash due to the corrupted memory.
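To illustrate the sch_multiq pattern above, here is a minimal userspace sketch
of it. This is not kernel code: struct toy_qdisc and the toy_*() helpers are
hypothetical names made up for the example. It assumes the caller always runs
the destroy step when init fails, so init must not free ->queues itself and
destroy must tolerate a partially initialized object.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a qdisc's private data; not a kernel struct. */
struct toy_qdisc {
	void **queues;	/* allocated in init, released only in destroy */
	int bands;
};

/* Models ->init(): it can fail after allocating, and leaves cleanup to
 * destroy instead of freeing ->queues on the error path. */
static int toy_init(struct toy_qdisc *q, int bands, int fail_after_alloc)
{
	assert(q != NULL);
	q->queues = NULL;
	q->bands = 0;

	if (bands <= 0)
		return -EINVAL;

	q->queues = calloc((size_t)bands, sizeof(*q->queues));
	if (!q->queues)
		return -ENOMEM;
	q->bands = bands;

	if (fail_after_alloc)
		return -EOPNOTSUPP;	/* no free here, destroy owns it */

	return 0;
}

/* Models ->destroy(): safe on a partially initialized or already
 * destroyed object, because free(NULL) is a no-op. */
static void toy_destroy(struct toy_qdisc *q)
{
	assert(q != NULL);
	free(q->queues);
	q->queues = NULL;
	q->bands = 0;
}

/* Models the caller that always invokes destroy when init fails. */
static int toy_create(struct toy_qdisc *q, int bands, int fail_after_alloc)
{
	int err = toy_init(q, bands, fail_after_alloc);

	if (err)
		toy_destroy(q);
	return err;
}
```

With this split, toy_create(&q, 4, 1) returns -EOPNOTSUPP and ->queues is
released exactly once. If toy_init() also freed ->queues on its error path
without clearing the pointer, the destroy call would free it a second time,
which is the double free described above.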