Re: BUG: KASAN: use-after-free in fib_table_flush

Ido Schimmel Sun, 17 Dec 2017 08:08:53 -0800

+Alexander

On Sun, Dec 17, 2017 at 08:55:57PM +0800, Fengguang Wu wrote:
> Hello,
> 
> FYI this happens in mainline kernel 4.15.0-rc3.
> It looks like a new regression.
> 
> It occurs in 4 out of 28 boots.
> 
> [  166.090516] 
> ==================================================================
> [  166.092419] BUG: KASAN: use-after-free in fib_table_flush+0x76c/0x870:
>                                               fib_table_flush at 
> net/ipv4/fib_trie.c:1868
> [  166.092907] Read of size 8 at addr ffff880012fc0b18 by task 
> kworker/u2:3/173
> [  166.093402]
> [  166.093528] CPU: 0 PID: 173 Comm: kworker/u2:3 Not tainted 4.15.0-rc3 #31
> [  166.094018] Workqueue: netns cleanup_net
> [  166.094298] Call Trace:
> [  166.094489]  print_address_description+0xa6/0x370:
>                                               print_address_description at 
> mm/kasan/report.c:253
> [  166.094867]  ? fib_table_flush+0x76c/0x870:
>                                               fib_table_flush at 
> net/ipv4/fib_trie.c:1868
> [  166.095159]  kasan_report+0x226/0x330:
>                                               kasan_report_error at 
> mm/kasan/report.c:352
>                                                (inlined by) kasan_report at 
> mm/kasan/report.c:409
> [  166.095420]  fib_table_flush+0x76c/0x870:
>                                               fib_table_flush at 
> net/ipv4/fib_trie.c:1868
> [  166.095698]  ? fib_table_flush_external+0x5a0/0x5a0:
>                                               fib_table_flush at 
> net/ipv4/fib_trie.c:1836
> [  166.096067]  ? ip_fib_net_exit+0x94/0x360:
>                                               ip_fib_net_exit at 
> net/ipv4/fib_frontend.c:1313 (discriminator 16)
> [  166.096350]  ip_fib_net_exit+0x228/0x360:
>                                               ip_fib_net_exit at 
> net/ipv4/fib_frontend.c:1316
> [  166.096629]  ? ip_fib_net_exit+0x360/0x360:
>                                               fib_net_exit at 
> net/ipv4/fib_frontend.c:1355
> [  166.096930]  ops_exit_list+0xa8/0x160
> [  166.097233]  cleanup_net+0x414/0x860:
>                                               cleanup_net at 
> net/core/net_namespace.c:483 (discriminator 9)
> [  166.097487]  ? net_drop_ns+0x80/0x80:
>                                               cleanup_net at 
> net/core/net_namespace.c:439
> [  166.097748]  ? kvm_sched_clock_read+0x5/0x10:
>                                               kvm_sched_clock_read at 
> arch/x86/kernel/kvmclock.c:101
> [  166.098051]  ? native_sched_clock_from_tsc+0x40/0x70:
>                                               __preempt_count_dec_and_test at 
> arch/x86/include/asm/preempt.h:91
>                                                (inlined by) cyc2ns_read_end 
> at arch/x86/kernel/tsc.c:81
>                                                (inlined by) cycles_2_ns at 
> arch/x86/kernel/tsc.c:135
>                                                (inlined by) 
> native_sched_clock_from_tsc at arch/x86/kernel/tsc.c:219
> [  166.098399]  ? sched_clock_cpu+0xf/0x70:
>                                               sched_clock_cpu at 
> kernel/sched/clock.c:363
> [  166.098672]  ? __lock_acquire+0x3b2/0x1fc0
> [  166.099054]  ? lock_downgrade+0x6a0/0x6a0:
>                                               lock_release at 
> kernel/locking/lockdep.c:4013
> [  166.099337]  ? lock_acquire+0x117/0x260:
>                                               get_current at 
> arch/x86/include/asm/current.h:15
>                                                (inlined by) lock_acquire at 
> kernel/locking/lockdep.c:4006
> [  166.099609]  ? process_one_work+0x70f/0x11c0:
>                                               process_one_work at 
> kernel/workqueue.c:2087
> [  166.099938]  process_one_work+0x791/0x11c0:
>                                               process_one_work at 
> kernel/workqueue.c:2118
> [  166.100229]  ? kvm_sched_clock_read+0x5/0x10:
>                                               kvm_sched_clock_read at 
> arch/x86/kernel/kvmclock.c:101
> [  166.100532]  ? sched_clock+0x2d/0x40:
>                                               paravirt_sched_clock at 
> arch/x86/include/asm/paravirt.h:174
>                                                (inlined by) sched_clock at 
> arch/x86/kernel/tsc.c:227
> [  166.100792]  ? cancel_delayed_work_sync+0x20/0x20:
>                                               process_one_work at 
> kernel/workqueue.c:2014
> [  166.101123]  worker_thread+0xe8/0x1070:
>                                               __read_once_size at 
> include/linux/compiler.h:183
>                                                (inlined by) list_empty at 
> include/linux/list.h:203
>                                                (inlined by) worker_thread at 
> kernel/workqueue.c:2247
> [  166.101392]  ? __kthread_parkme+0x164/0x230:
>                                               __kthread_parkme at 
> kernel/kthread.c:188
> [  166.101689]  ? process_one_work+0x11c0/0x11c0:
>                                               worker_thread at 
> kernel/workqueue.c:2189
> [  166.102006]  kthread+0x2fd/0x400:
>                                               kthread at kernel/kthread.c:238
> [  166.102240]  ? kthread_create_on_node+0xf0/0xf0:
>                                               kthread at kernel/kthread.c:198
> [  166.102561]  ret_from_fork+0x1f/0x30:
>                                               ret_from_fork at 
> arch/x86/entry/entry_64.S:447
> [  166.102855]
> [  166.102972] Allocated by task 1907:
> [  166.103235]  __kmalloc+0xf6/0x1a0:
>                                               __kmalloc at mm/slub.c:3765
> [  166.103475]  fib_trie_table+0xe8/0x240:
>                                               fib_trie_table at 
> net/ipv4/fib_trie.c:2081
> [  166.103748]  fib_net_init+0x1bc/0x570:
>                                               fib4_rules_init at 
> net/ipv4/fib_frontend.c:59
>                                                (inlined by) ip_fib_net_init 
> at net/ipv4/fib_frontend.c:1287
>                                                (inlined by) fib_net_init at 
> net/ipv4/fib_frontend.c:1335
> [  166.104032]  ops_init+0x1c0/0x360:
>                                               ops_init at 
> net/core/net_namespace.c:119
> [  166.104269]  setup_net+0x23c/0x530:
>                                               setup_net at 
> net/core/net_namespace.c:296
> [  166.104512]  copy_net_ns+0x170/0x350:
>                                               copy_net_ns at 
> net/core/net_namespace.c:420
> [  166.104779]  create_new_namespaces+0x343/0x730:
>                                               create_new_namespaces at 
> kernel/nsproxy.c:107
> [  166.105091]  unshare_nsproxy_namespaces+0xa1/0x150:
>                                               unshare_nsproxy_namespaces at 
> kernel/nsproxy.c:206 (discriminator 4)
> [  166.105427]  SyS_unshare+0x338/0x6c0
> [  166.105682]  do_syscall_64+0x21f/0xb80:
>                                               do_syscall_64 at 
> arch/x86/entry/common.c:285
> [  166.105954]  return_from_SYSCALL_64+0x0/0x65:
>                                               return_from_SYSCALL_64 at 
> arch/x86/entry/entry_64.S:259
> [  166.106253]
> [  166.106367] Freed by task 11:
> [  166.106581]  kfree+0x102/0x1d0:
>                                               slab_free at mm/slub.c:2973
>                                                (inlined by) kfree at 
> mm/slub.c:3899
> [  166.106838]  rcu_do_batch+0x331/0x7f0:
>                                               rcu_lock_release at 
> include/linux/rcupdate.h:249
>                                                (inlined by) __rcu_reclaim at 
> kernel/rcu/rcu.h:196
>                                                (inlined by) rcu_do_batch at 
> kernel/rcu/tree.c:2758
> [  166.107102]  rcu_cpu_kthread+0x12a/0x160:
>                                               rcu_preempt_do_callbacks at 
> kernel/rcu/tree_plugin.h:687
>                                                (inlined by) 
> rcu_kthread_do_work at kernel/rcu/tree_plugin.h:1142
>                                                (inlined by) rcu_cpu_kthread 
> at kernel/rcu/tree_plugin.h:1184
> [  166.107381]  smpboot_thread_fn+0x3c1/0x820:
>                                               smpboot_thread_fn at 
> kernel/smpboot.c:164
> [  166.107669]  kthread+0x2fd/0x400:
>                                               kthread at kernel/kthread.c:238
> [  166.107928]  ret_from_fork+0x1f/0x30:
>                                               ret_from_fork at 
> arch/x86/entry/entry_64.S:447
> [  166.108181]
> [  166.108295] The buggy address belongs to the object at ffff880012fc0ae0
> [  166.108295]  which belongs to the cache kmalloc-64 of size 64
> [  166.109179] The buggy address is located 56 bytes inside of
> [  166.109179]  64-byte region [ffff880012fc0ae0, ffff880012fc0b20)


Hi Alexander,

Note that CONFIG_IP_MULTIPLE_TABLES is disabled, so both the main and
local tables are allocated during init and share the same trie.

I think what happens is that ip_fib_net_exit() frees the main table and
its trie via an RCU callback. If that callback runs before the local
table is iterated over, flushing the local table touches the freed trie,
which is the use-after-free reported above.

I can reliably trigger the bug by adding synchronize_rcu() at the end of
each iteration of the loop in ip_fib_net_exit().

The problem goes away if we iterate over the tables in reverse order,
which is symmetric to fib4_rules_init().

What do you think?
