David Wilder <dwil...@us.ibm.com> wrote:
> This crash happened on a ppc64le system running ltp network tests when an ltp 
> script ran "rmmod iptable_mangle".
> 
> [213425.602369] BUG: Kernel NULL pointer dereference at 0x00000010
> [213425.602388] Faulting instruction address: 0xc008000000550bdc
[..]

> In the crash we find in iptable_mangle_hook() that 
> state->net->ipv4.iptable_mangle=NULL causing a NULL pointer dereference. 
> net->ipv4.iptable_mangle is set to NULL in iptable_mangle_net_exit(), which is 
> called when the iptable_mangle module is unloaded. A rmmod task was found in the 
> crash dump.  A 2nd crash showed the same problem when running "rmmod 
> iptable_filter" (net->ipv4.iptable_filter=NULL).
> 
> Once a hook is registered, packets will pick up a pointer from 
> net->ipv4.iptable_$table. The patch adds a call to synchronize_net() in 
> ipt_unregister_table() to ensure no packets are in flight that have picked up 
> the pointer before completing the unregister.
> 
> This change has prevented the problem in our testing.  However, we have 
> concerns with this change as it would mean that on netns cleanup, we would 
> need one synchronize_net() call for every table in use. Also, on module 
> unload, there would be one synchronize_net() for every existing netns.

Yes, I agree with the analysis.

> Signed-off-by: David Wilder <dwil...@us.ibm.com>
> ---
>  net/ipv4/netfilter/ip_tables.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
> index c2670ea..97c4121 100644
> --- a/net/ipv4/netfilter/ip_tables.c
> +++ b/net/ipv4/netfilter/ip_tables.c
> @@ -1800,8 +1800,10 @@ int ipt_register_table(struct net *net, const struct xt_table *table,
>  void ipt_unregister_table(struct net *net, struct xt_table *table,
>                         const struct nf_hook_ops *ops)
>  {
> -     if (ops)
> +     if (ops) {
>               nf_unregister_net_hooks(net, ops, hweight32(table->valid_hooks));
> +             synchronize_net();
> +     }

I'd wager ebtables, arptables and ip6tables have the same bug.

The extra synchronize_net() isn't ideal.  We could probably do it this
way and then improve in a second patch.

One way to fix this without a new synchronize_net() is to switch all
iptable_foo.c files to use a ".pre_exit" hook as well.

.pre_exit would unregister the underlying hook and .exit would do the
table freeing.

Since the netns core already does an unconditional synchronize_rcu after
the pre_exit hooks this would avoid the problem as well.
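
Roughly like this for iptable_mangle.c. This is an untested sketch, not a
patch: it assumes the file keeps its hook ops in a static "mangle_ops"
pointer, and it relies on ipt_unregister_table() skipping the hook
unregistration when ops is NULL, as the "if (ops)" check quoted above
suggests.

```c
/* Phase 1: drop the hooks so no new packets can reach the table.
 * The netns core then runs synchronize_rcu() before calling .exit,
 * so packets already in flight have finished by the time we free.
 */
static void __net_exit iptable_mangle_net_pre_exit(struct net *net)
{
	struct xt_table *table = net->ipv4.iptable_mangle;

	if (table)
		/* mangle_ops: assumed name of this file's nf_hook_ops array */
		nf_unregister_net_hooks(net, mangle_ops,
					hweight32(table->valid_hooks));
}

/* Phase 2: hooks are already gone, only free the table. */
static void __net_exit iptable_mangle_net_exit(struct net *net)
{
	if (!net->ipv4.iptable_mangle)
		return;

	/* ops == NULL: skip hook unregistration, done in pre_exit */
	ipt_unregister_table(net, net->ipv4.iptable_mangle, NULL);
	net->ipv4.iptable_mangle = NULL;
}

static struct pernet_operations iptable_mangle_net_ops = {
	.pre_exit = iptable_mangle_net_pre_exit,
	.exit	  = iptable_mangle_net_exit,
};
```

The other iptable_foo.c files, and presumably the ip6tables, arptables and
ebtables equivalents, would need the same split.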
