From: Jiri Wiesner <jwies...@suse.com>
Date: Wed, 5 Dec 2018 16:55:29 +0100

> The *_frag_reasm() functions are susceptible to miscalculating the byte
> count of packet fragments in case the truesize of a head buffer changes.
> The truesize member may be changed by the call to skb_unclone(), leaving
> the fragment memory limit counter unbalanced even if all fragments are
> processed. This miscalculation goes unnoticed as long as the network
> namespace which holds the counter is not destroyed.
>
> Should an attempt be made to destroy a network namespace that holds an
> unbalanced fragment memory limit counter the cleanup of the namespace
> never finishes. The thread handling the cleanup gets stuck in
> inet_frags_exit_net() waiting for the percpu counter to reach zero. The
> thread is usually in running state with a stacktrace similar to:
>
>  PID: 1073   TASK: ffff880626711440  CPU: 1   COMMAND: "kworker/u48:4"
>   #5 [ffff880621563d48] _raw_spin_lock at ffffffff815f5480
>   #6 [ffff880621563d48] inet_evict_bucket at ffffffff8158020b
>   #7 [ffff880621563d80] inet_frags_exit_net at ffffffff8158051c
>   #8 [ffff880621563db0] ops_exit_list at ffffffff814f5856
>   #9 [ffff880621563dd8] cleanup_net at ffffffff814f67c0
>  #10 [ffff880621563e38] process_one_work at ffffffff81096f14
>
> It is not possible to create new network namespaces, and processes
> that call unshare() end up being stuck in uninterruptible sleep state
> waiting to acquire the net_mutex.
>
> The bug was observed in the IPv6 netfilter code by Per Sundstrom.
> I thank him for his analysis of the problem. The parts of this patch
> that apply to IPv4 and IPv6 fragment reassembly are preemptive measures.
>
> Signed-off-by: Jiri Wiesner <jwies...@suse.com>
> Reported-by: Per Sundstrom <per.sundst...@redqube.se>

Nice catch. Applied and queued up for -stable, thanks!
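
The imbalance described in the quoted commit message can be modelled in
userspace. The following is a minimal sketch in C, not the kernel code:
every toy_* name, the 768 and 1024 byte truesize values, and the choice to
make the head buffer grow are assumptions made for illustration. It charges
a counter for each queued fragment, changes the head's truesize the way
skb_unclone() may, and then uncharges the sum of the current truesize
values. With adjust set, the change in the head's truesize is added to the
counter before uncharging, which is one way to keep the counter balanced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff: only truesize matters here. */
struct toy_skb {
	unsigned int truesize;
};

/* Hypothetical stand-in for the per-namespace fragment memory counter. */
struct toy_netns_frags {
	long mem;
};

static void toy_add_frag_mem(struct toy_netns_frags *nf, long val)
{
	nf->mem += val;
}

static void toy_sub_frag_mem(struct toy_netns_frags *nf, long val)
{
	nf->mem -= val;
}

/* Models an unclone step that leaves the head with a different truesize. */
static void toy_unclone(struct toy_skb *head, unsigned int new_truesize)
{
	head->truesize = new_truesize;
}

/* Charge the counter once per fragment as it is queued. */
static void toy_frag_queue(struct toy_netns_frags *nf,
			   const struct toy_skb *frags, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_add_frag_mem(nf, (long)frags[i].truesize);
}

/*
 * Reassemble: unclone the head, optionally account for the truesize
 * change, then uncharge the sum of the current truesize values.
 * Returns what is left in the counter.
 */
static long toy_frag_reasm(struct toy_netns_frags *nf, struct toy_skb *frags,
			   size_t n, unsigned int head_truesize_after,
			   int adjust)
{
	long delta, sum = 0;

	assert(n > 0);

	delta = -(long)frags[0].truesize;
	toy_unclone(&frags[0], head_truesize_after);
	delta += (long)frags[0].truesize;

	/* Only the adjusted variant tells the counter about the change. */
	if (adjust && delta)
		toy_add_frag_mem(nf, delta);

	for (size_t i = 0; i < n; i++)
		sum += (long)frags[i].truesize;
	toy_sub_frag_mem(nf, sum);

	return nf->mem;
}

/* Three 768-byte fragments; the head becomes 1024 bytes when uncloned. */
static long toy_run(int adjust)
{
	struct toy_netns_frags nf = { 0 };
	struct toy_skb frags[3] = { { 768 }, { 768 }, { 768 } };

	toy_frag_queue(&nf, frags, 3);
	return toy_frag_reasm(&nf, frags, 3, 1024, adjust);
}
```

In this model toy_run(0) leaves the counter at -256 and toy_run(1) leaves
it at 0. A nonzero leftover is the condition that, per the message above,
keeps inet_frags_exit_net() waiting.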