On Thu, May 05, 2016 at 01:18:43PM +0300, Kirill A. Shutemov wrote:
> On Thu, May 05, 2016 at 09:32:45AM +0800, kernel test robot wrote:
> > FYI, we noticed the following commit:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> > commit 409ca714ac58768342cd39ca79c16f51e1825b3e ("mm, thp: avoid unnecessary swapin in khugepaged")
> >
> > on test machine: vm-kbuild-1G: 2 threads qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap with 1G memory
> >
> > caused below changes:
> >
> <trying to dedup strings, it's really annoying>
>
> >
> > [ 21.116124] ======================================================
> > [ 21.116124] [ INFO: possible circular locking dependency detected ]
> > [ 21.116127] 4.6.0-rc5-00302-g409ca71 #1 Not tainted
> > [ 21.116127] -------------------------------------------------------
> > [ 21.116128] udevadm/221 is trying to acquire lock:
> > [ 21.116138]  (&mm->mmap_sem){++++++}, at: [<ffffffff81262543>] __might_fault+0x83/0x150
> > [ 21.116138]
> > [ 21.116138] but task is already holding lock:
> > [ 21.116144]  (s_active#12){++++.+}, at: [<ffffffff813315ee>] kernfs_fop_write+0x8e/0x250
> > [ 21.116144]
> > [ 21.116144] which lock already depends on the new lock.
> > [ 21.116144]
> > [ 21.116145] the existing dependency chain (in reverse order) is:
> > [ 21.116148]
> > [ 21.116148] -> #2 (s_active#12){++++.+}:
> > [ 21.116152]        [<ffffffff8117da2c>] lock_acquire+0xac/0x180
> > [ 21.116155]        [<ffffffff8132f50a>] __kernfs_remove+0x2da/0x410
> > [ 21.116158]        [<ffffffff81330630>] kernfs_remove_by_name_ns+0x40/0x90
> > [ 21.116160]        [<ffffffff813339fb>] sysfs_remove_file_ns+0x2b/0x70
> > [ 21.116164]        [<ffffffff81ba8a16>] device_del+0x166/0x320
> > [ 21.116166]        [<ffffffff81ba943c>] device_destroy+0x3c/0x50
> > [ 21.116170]        [<ffffffff8105aa61>] cpuid_class_cpu_callback+0x51/0x70
> > [ 21.116173]        [<ffffffff81131ce9>] notifier_call_chain+0x59/0x190
> > [ 21.116177]        [<ffffffff81132749>] __raw_notifier_call_chain+0x9/0x10
> > [ 21.116180]        [<ffffffff810fe6b0>] __cpu_notify+0x40/0x90
> > [ 21.116182]        [<ffffffff810fe890>] cpu_notify_nofail+0x10/0x30
> > [ 21.116185]        [<ffffffff810fe8d7>] notify_dead+0x27/0x1e0
> > [ 21.116187]        [<ffffffff810fe273>] cpuhp_down_callbacks+0x93/0x190
> > [ 21.116192]        [<ffffffff82096062>] _cpu_down+0xc2/0x1e0
> > [ 21.116194]        [<ffffffff810ff727>] do_cpu_down+0x37/0x50
> > [ 21.116197]        [<ffffffff8110003b>] cpu_down+0xb/0x10
> > [ 21.116201]        [<ffffffff81038e4d>] _debug_hotplug_cpu+0x7d/0xd0
> > [ 21.116205]        [<ffffffff8435d6bb>] debug_hotplug_cpu+0xd/0x11
> > [ 21.116208]        [<ffffffff84352426>] do_one_initcall+0x138/0x1cf
> > [ 21.116211]        [<ffffffff8435270a>] kernel_init_freeable+0x24d/0x2de
> > [ 21.116214]        [<ffffffff8209533a>] kernel_init+0xa/0x120
> > [ 21.116217]        [<ffffffff820a7972>] ret_from_fork+0x22/0x50
> > [ 21.116221]
> > [ 21.116221] -> #1 (cpu_hotplug.lock#2){+.+.+.}:
> > [ 21.116223]        [<ffffffff8117da2c>] lock_acquire+0xac/0x180
> > [ 21.116226]        [<ffffffff820a20d1>] mutex_lock_nested+0x71/0x4c0
> > [ 21.116228]        [<ffffffff810ff526>] get_online_cpus+0x66/0x80
> > [ 21.116232]        [<ffffffff81246fb3>] sum_vm_event+0x23/0x1b0
> > [ 21.116236]        [<ffffffff81293768>] collapse_huge_page+0x118/0x10b0
> > [ 21.116238]        [<ffffffff81294c5d>] khugepaged+0x55d/0xe80
> > [ 21.116240]        [<ffffffff81130304>] kthread+0x134/0x1a0
> > [ 21.116242]        [<ffffffff820a7972>] ret_from_fork+0x22/0x50
> > [ 21.116244]
> > [ 21.116244] -> #0 (&mm->mmap_sem){++++++}:
> > [ 21.116246]        [<ffffffff8117bf61>] __lock_acquire+0x2861/0x31f0
> > [ 21.116248]        [<ffffffff8117da2c>] lock_acquire+0xac/0x180
> > [ 21.116251]        [<ffffffff8126257e>] __might_fault+0xbe/0x150
> > [ 21.116253]        [<ffffffff8133160f>] kernfs_fop_write+0xaf/0x250
> > [ 21.116256]        [<ffffffff812a8933>] __vfs_write+0x43/0x1a0
> > [ 21.116258]        [<ffffffff812a8d3a>] vfs_write+0xda/0x240
> > [ 21.116260]        [<ffffffff812a8f84>] SyS_write+0x44/0xa0
> > [ 21.116263]        [<ffffffff820a773c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
> > [ 21.116264]
> > [ 21.116264] other info that might help us debug this:
> > [ 21.116264]
> > [ 21.116268] Chain exists of:
> > [ 21.116268]   &mm->mmap_sem --> cpu_hotplug.lock#2 --> s_active#12
> > [ 21.116268]
> > [ 21.116268]  Possible unsafe locking scenario:
> > [ 21.116268]
> > [ 21.116269]        CPU0                    CPU1
> > [ 21.116269]        ----                    ----
> > [ 21.116270]   lock(s_active#12);
> > [ 21.116271]                                lock(cpu_hotplug.lock#2);
> > [ 21.116272]                                lock(s_active#12);
> > [ 21.116273]   lock(&mm->mmap_sem);
> > [ 21.116274]
> > [ 21.116274]  *** DEADLOCK ***
> > [ 21.116274]
> > [ 21.116274]  *** DEADLOCK ***
> > [ 21.116274]
> > [ 21.116274] 3 locks held by udevadm/221:
> > [ 21.116278]  #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff812ad64d>] __sb_start_write+0x6d/0x120
> > [ 21.116280]  #1:  (&of->mutex){+.+.+.}, at: [<ffffffff813315e6>] kernfs_fop_write+0x86/0x250
> > [ 21.116282]  #2:  (s_active#12){++++.+}, at: [<ffffffff813315ee>] kernfs_fop_write+0x8e/0x250
> > [ 21.116283]
> > [ 21.116283] stack backtrace:
> > [ 21.116283]
> > [ 21.116283] stack backtrace:
> > [ 21.116284] CPU: 1 PID: 221 Comm: udevadm Not tainted 4.6.0-rc5-00302-g409ca71 #1
> > [ 21.116287]  ffff88003f698000 ffff88003f077bf0 ffffffff81444ef3 0000000000000011
> > [ 21.116288]  ffffffff84bdd8f0 ffffffff84bf2630 ffff88003f077c40 ffffffff81173e91
> > [ 21.116290]  0000000000000000 ffffffff84fbdbc0 00ff88003f077c40 ffff88003f698bb8
> > [ 21.116290] Call Trace:
> > [ 21.116293]  [<ffffffff81444ef3>] dump_stack+0x86/0xd3
> > [ 21.116294]  [<ffffffff81173e91>] print_circular_bug+0x221/0x360
> > [ 21.116296]  [<ffffffff8117bf61>] __lock_acquire+0x2861/0x31f0
> > [ 21.116297]  [<ffffffff8117da2c>] lock_acquire+0xac/0x180
> > [ 21.116299]  [<ffffffff81262543>] ? __might_fault+0x83/0x150
> > [ 21.116300]  [<ffffffff8126257e>] __might_fault+0xbe/0x150
> > [ 21.116302]  [<ffffffff81262543>] ? __might_fault+0x83/0x150
> > [ 21.116303]  [<ffffffff8133160f>] kernfs_fop_write+0xaf/0x250
> > [ 21.116304]  [<ffffffff812a8933>] __vfs_write+0x43/0x1a0
> > [ 21.116306]  [<ffffffff8116fe0d>] ? update_fast_ctr+0x1d/0x80
> > [ 21.116308]  [<ffffffff8116ffe7>] ? percpu_down_read+0x57/0xa0
> > [ 21.116310]  [<ffffffff812ad64d>] ? __sb_start_write+0x6d/0x120
> > [ 21.116311]  [<ffffffff812ad64d>] ? __sb_start_write+0x6d/0x120
> > [ 21.116312]  [<ffffffff812a8d3a>] vfs_write+0xda/0x240
> > [ 21.116314]  [<ffffffff812a8f84>] SyS_write+0x44/0xa0
> > [ 21.116315]  [<ffffffff820a773c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
>
> If I read this correctly (I'm not sure about this), we shouldn't call
> sum_vm_event() under mmap_sem.
>
> BTW, we do need mmap_sem for swapin, but there's no need for exclusive
> one. It can be too expensive to do I/O with down_write(mmap_sem).
>
> Ebru, could look how to move sum_vm_event() outside mmap_sem and probably
> have down_read(mmap_sem) during swapin? I don't have time for this right
> now.
I started working on the down_read(mmap_sem) change a few days ago. But first I'll fix this lockdep issue, then continue with the down_read work; a rough sketch of the direction I have in mind is below.

Kind regards.
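For reference only, a minimal sketch of what Kirill describes, not a tested patch: the sum_vm_event() check is sampled before mmap_sem is taken (so the get_online_cpus() inside it is never nested under mmap_sem), the swapin I/O runs under down_read(mmap_sem), and down_write(mmap_sem) is taken only for the collapse itself. The function shape and the allowed_swap_count()/do_swapin()/do_collapse() helpers are hypothetical placeholders, not the real khugepaged code.

/*
 * Hypothetical sketch -- not the real collapse_huge_page() signature.
 * Illustrates the suggested ordering: sample VM event counters before
 * mmap_sem, do swapin under the shared lock, collapse under the
 * exclusive lock.
 */
static void collapse_huge_page_sketch(struct mm_struct *mm,
				      unsigned long address)
{
	/*
	 * sum_vm_event() -> get_online_cpus() -> cpu_hotplug.lock:
	 * sample it while no mmap_sem is held, which breaks the
	 * mmap_sem -> cpu_hotplug.lock link in the reported chain.
	 */
	unsigned long swap_budget = allowed_swap_count();	/* hypothetical helper */

	/* The shared lock is enough for the swapin I/O. */
	down_read(&mm->mmap_sem);
	do_swapin(mm, address, swap_budget);			/* hypothetical helper */
	up_read(&mm->mmap_sem);

	/*
	 * The VMA may have changed while mmap_sem was dropped, so it has
	 * to be revalidated after re-taking the lock exclusively for the
	 * collapse itself.
	 */
	down_write(&mm->mmap_sem);
	do_collapse(mm, address);				/* hypothetical helper */
	up_write(&mm->mmap_sem);
}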