On Wed, Apr 14, 2021 at 8:58 AM Zhang, Qiang <qiang.zh...@windriver.com> wrote:
>
> ________________________________________
> From: Dmitry Vyukov <dvyu...@google.com>
> Sent: Tuesday, April 13, 2021 11:29 PM
> To: Zhang, Qiang
> Cc: Andrew Halaney; andreyk...@gmail.com; ryabinin....@gmail.com;
> a...@linux-foundation.org; linux-kernel@vger.kernel.org;
> kasan-...@googlegroups.com
> Subject: Re: Question on KASAN calltrace record in RT
>
> [Please note: This e-mail is from an EXTERNAL e-mail address]
>
> On Tue, Apr 6, 2021 at 10:26 AM Zhang, Qiang <qiang.zh...@windriver.com> wrote:
> >
> > Hello everyone,
> >
> > On an RT system, Andrew's testing triggered the calltrace below.
> > KASAN records call stacks through stack_depot_save(), which may call
> > alloc_pages(). On RT, the spinlocks in alloc_pages() are replaced
> > with rt_mutexes, so if interrupts are disabled when this function is
> > reached, the calltrace below is triggered.
> >
> > Maybe we could add an array[KASAN_STACK_DEPTH] to struct kasan_track
> > to record the call stack directly on RT systems.
> >
> > Is there a better solution?
>
> > Hi Qiang,
> >
> > Adding 2 full stacks per heap object can increase memory usage too much.
> > The stackdepot has a preallocation mechanism, I would start with
> > adding an interrupts check here:
> > https://elixir.bootlin.com/linux/v5.12-rc7/source/lib/stackdepot.c#L294
> > and just not do the preallocation in interrupt context. This will
> > solve the problem, right?
>
> It seems to be useful. However, there is the following situation: if a
> lot of stack traces need to be saved from interrupts, the memory that
> was preallocated to hold the stack information can be depleted, and
> when a stack needs to be saved in an interrupt again, there will be no
> memory available.
Yes, this is true. This is also true now because we allocate with
GFP_ATOMIC; that is a deliberate design decision. Note that a unique
allocation stack is saved only once, so it's enough to be lucky only
once per stack. Interrupts also don't tend to allocate thousands of
objects. So all in all I think it should work fine in practice. If it
turns out to be a problem, we could simply preallocate more memory in
an RT config.

> Thanks
> Qiang
>
> > Thanks
> > Qiang
> >
> > BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:951
> > [ 14.522262] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 640, name: mount
> > [ 14.522304] Call Trace:
> > [ 14.522306]  dump_stack+0x92/0xc1
> > [ 14.522313]  ___might_sleep.cold.99+0x1b0/0x1ef
> > [ 14.522319]  rt_spin_lock+0x3e/0xc0
> > [ 14.522329]  local_lock_acquire+0x52/0x3c0
> > [ 14.522332]  get_page_from_freelist+0x176c/0x3fd0
> > [ 14.522543]  __alloc_pages_nodemask+0x28f/0x7f0
> > [ 14.522559]  stack_depot_save+0x3a1/0x470
> > [ 14.522564]  kasan_save_stack+0x2f/0x40
> > [ 14.523575]  kasan_record_aux_stack+0xa3/0xb0
> > [ 14.523580]  insert_work+0x48/0x340
> > [ 14.523589]  __queue_work+0x430/0x1280
> > [ 14.523595]  mod_delayed_work_on+0x98/0xf0
> > [ 14.523607]  kblockd_mod_delayed_work_on+0x17/0x20
> > [ 14.523611]  blk_mq_run_hw_queue+0x151/0x2b0
> > [ 14.523620]  blk_mq_sched_insert_request+0x2ad/0x470
> > [ 14.523633]  blk_mq_submit_bio+0xd2a/0x2330
> > [ 14.523675]  submit_bio_noacct+0x8aa/0xfe0
> > [ 14.523693]  submit_bio+0xf0/0x550
> > [ 14.523714]  submit_bio_wait+0xfe/0x200
> > [ 14.523724]  xfs_rw_bdev+0x370/0x480 [xfs]
> > [ 14.523831]  xlog_do_io+0x155/0x320 [xfs]
> > [ 14.524032]  xlog_bread+0x23/0xb0 [xfs]
> > [ 14.524133]  xlog_find_head+0x131/0x8b0 [xfs]
> > [ 14.524375]  xlog_find_tail+0xc8/0x7b0 [xfs]
> > [ 14.524828]  xfs_log_mount+0x379/0x660 [xfs]
> > [ 14.524927]  xfs_mountfs+0xc93/0x1af0 [xfs]
> > [ 14.525424]  xfs_fs_fill_super+0x923/0x17f0 [xfs]
> > [ 14.525522]  get_tree_bdev+0x404/0x680
> > [ 14.525622]  vfs_get_tree+0x89/0x2d0
> > [ 14.525628]  path_mount+0xeb2/0x19d0
> > [ 14.525648]  do_mount+0xcb/0xf0
> > [ 14.525665]  __x64_sys_mount+0x162/0x1b0
> > [ 14.525670]  do_syscall_64+0x33/0x40
> > [ 14.525674]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 14.525677] RIP: 0033:0x7fd6c15eaade
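
For illustration, the interrupts check suggested above could look
roughly like the sketch below, against the slab-preallocation block in
stack_depot_save() in v5.12-rc7's lib/stackdepot.c. The exact
condition (in_interrupt() || irqs_disabled()) is an assumption about
what the "interrupts check" would be, not a tested patch; irqs_disabled()
is included because the splat above fires from task context (mount)
with interrupts disabled, where in_interrupt() alone would not trigger.

        /* lib/stackdepot.c, stack_depot_save(), near v5.12-rc7 line 294 */
        if (unlikely(!smp_load_acquire(&next_slab_inited)) &&
            !in_interrupt() && !irqs_disabled()) {
                /*
                 * Sketch: skip the preallocation when sleeping is not
                 * allowed. On PREEMPT_RT, alloc_pages() can take a
                 * sleeping rt_mutex, which produces the might_sleep()
                 * splat above in such contexts.
                 *
                 * Zero out zone modifiers, as we don't have specific
                 * zone requirements. Keep the flags related to
                 * allocation in atomic contexts and I/O.
                 */
                alloc_flags &= ~GFP_ZONEMASK;
                alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
                alloc_flags |= __GFP_NOWARN;
                page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
                if (page)
                        prealloc = page_address(page);
        }

With such a check, saves from interrupt context fall back to slabs
preallocated earlier from sleepable contexts; if those run out while
in an interrupt, the save simply fails, which is the depletion
scenario Qiang raises above.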
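
For comparison, the alternative Qiang mentions, recording the call
stack inline in struct kasan_track instead of a stackdepot handle,
would look roughly like this sketch. The real struct in mm/kasan/kasan.h
stores only a depot handle; the RT-only fields here are hypothetical:

        /* mm/kasan/kasan.h (sketch of the proposed RT variant) */
        struct kasan_track {
                u32 pid;
        #ifdef CONFIG_PREEMPT_RT
                /* Record the stack inline; no allocation at save time. */
                unsigned long stack_entries[KASAN_STACK_DEPTH];
                unsigned int nr_entries;
        #else
                depot_stack_handle_t stack;
        #endif
        };

This avoids any allocation at save time, but it is exactly what
Dmitry's memory-usage objection targets: with KASAN_STACK_DEPTH (64 in
mm/kasan/kasan.h) entries of 8 bytes each, every track grows from a
4-byte handle to over 512 bytes, and each heap object carries two full
stacks (alloc and free).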