On Wed, 2017-05-31 at 13:21 -0400, Dave Jones wrote:
> Just hit this during a trinity run.
925bb1ce47f429f69aad35876df7ecd8c53deb7e is the first bad commit
commit 925bb1ce47f429f69aad35876df7ecd8c53deb7e
Author: Vegard Nossum <vegard.nos...@oracle.com>
Date:   Thu May 11 12:18:52 2017 +0200

    tty: fix port buffer locking

    tty_insert_flip_string_fixed_flag() is racy against itself when called
    from the ioctl(TCXONC, TCION/TCIOFF) path [1] and the flush_to_ldisc()
    workqueue path [2].

    The problem is that port->buf.tail->used is modified without consistent
    locking; the ioctl path takes tty->atomic_write_lock, whereas the
    workqueue path takes ldata->output_lock.

    We cannot simply take ldata->output_lock, since that is specific to the
    N_TTY line discipline.

    It might seem natural to try to take port->buf.lock inside
    tty_insert_flip_string_fixed_flag() and friends (where port->buf is
    actually used/modified), but this creates problems for flush_to_ldisc()
    which takes it before grabbing tty->ldisc_sem, o_tty->termios_rwsem,
    and ldata->output_lock.

    Therefore, the simplest solution for now seems to be to take
    tty->atomic_write_lock inside tty_port_default_receive_buf(). This
    lock is also used in the write path [3] with a consistent ordering.
    [1]: Call Trace:
          tty_insert_flip_string_fixed_flag
          pty_write
          tty_send_xchar                  // down_read(&o_tty->termios_rwsem)
                                          // mutex_lock(&tty->atomic_write_lock)
          n_tty_ioctl_helper
          n_tty_ioctl
          tty_ioctl                       // down_read(&tty->ldisc_sem)
          do_vfs_ioctl
          SyS_ioctl

    [2]: Workqueue: events_unbound flush_to_ldisc
         Call Trace:
          tty_insert_flip_string_fixed_flag
          pty_write
          tty_put_char
          __process_echoes
          commit_echoes                   // mutex_lock(&ldata->output_lock)
          n_tty_receive_buf_common
          n_tty_receive_buf2
          tty_ldisc_receive_buf           // down_read(&o_tty->termios_rwsem)
          tty_port_default_receive_buf    // down_read(&tty->ldisc_sem)
          flush_to_ldisc                  // mutex_lock(&port->buf.lock)
          process_one_work

    [3]: Call Trace:
          tty_insert_flip_string_fixed_flag
          pty_write
          n_tty_write                     // mutex_lock(&ldata->output_lock)
                                          // down_read(&tty->termios_rwsem)
          do_tty_write (inline)           // mutex_lock(&tty->atomic_write_lock)
          tty_write                       // down_read(&tty->ldisc_sem)
          __vfs_write
          vfs_write
          SyS_write

    The bug can result in about a dozen different crashes depending on
    what exactly gets corrupted when port->buf.tail->used points outside
    the buffer.

    The patch passes my LOCKDEP/PROVE_LOCKING testing but more testing
    is always welcome.

    Found using syzkaller.
    Cc: <sta...@vger.kernel.org>
    Signed-off-by: Vegard Nossum <vegard.nos...@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

:040000 040000 6df9d7d2f22ff6ac1ad265869c1cb0d15621a0eb 56e5e66a5e451bf55857025899f4a092b51dca6e M	drivers

git bisect start
# good: [a351e9b9fc24e982ec2f0e76379a49826036da12] Linux 4.11
git bisect good a351e9b9fc24e982ec2f0e76379a49826036da12
# bad: [104c08ba8e921ef97abfdc10408d54921a6d9003] Merge tag 'acpi-4.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad 104c08ba8e921ef97abfdc10408d54921a6d9003
# good: [16a12fa9aed176444fc795b09e796be41902bb08] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git bisect good 16a12fa9aed176444fc795b09e796be41902bb08
# good: [2d3e4866dea96b0506395b47bfefb234f2088dac] Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
git bisect good 2d3e4866dea96b0506395b47bfefb234f2088dac
# good: [f94c128eefcce2e3448d543f13cd7d7b8aa660a5] Merge tag 'metag-for-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag
git bisect good f94c128eefcce2e3448d543f13cd7d7b8aa660a5
# good: [18365225f0440d09708ad9daade2ec11275c3df9] hwpoison, memcg: forcibly uncharge LRU pages
git bisect good 18365225f0440d09708ad9daade2ec11275c3df9
# good: [5bbecdbc8e7ffaaf47ac1f02014bf3bedda3fd11] nvme_fc: Support ctrl_loss_tmo
git bisect good 5bbecdbc8e7ffaaf47ac1f02014bf3bedda3fd11
# good: [d024baa58a4a7e5eb6058017771d15b9e47b56db] Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good d024baa58a4a7e5eb6058017771d15b9e47b56db
# bad: [393bcfaeb8be7f46a4cd7d673e33541ebee76b12] Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
git bisect bad 393bcfaeb8be7f46a4cd7d673e33541ebee76b12
# bad: [b0f5a8f32e8bbdaae1abb8abe2d3cbafaba57e08] kthread: fix boot hang (regression) on MIPS/OpenRISC
git bisect bad b0f5a8f32e8bbdaae1abb8abe2d3cbafaba57e08
# good: [6f68a6ae1f0ea2fd87e26dc345866e928b5850a8] Merge tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
git bisect good 6f68a6ae1f0ea2fd87e26dc345866e928b5850a8
# bad: [da4c279942b05727088774df224c0734688b4cbc] serial: enable serdev support
git bisect bad da4c279942b05727088774df224c0734688b4cbc
# good: [5c9d6abed9e0a061de252a53ab687a1171502e81] serial: altera_jtaguart: adding iounmap()
git bisect good 5c9d6abed9e0a061de252a53ab687a1171502e81
# bad: [925bb1ce47f429f69aad35876df7ecd8c53deb7e] tty: fix port buffer locking
git bisect bad 925bb1ce47f429f69aad35876df7ecd8c53deb7e
# good: [11d4d32158eeaf36fe1073a4a260193d7a19ccf1] tty: ehv_bytechan: clean up init error handling
git bisect good 11d4d32158eeaf36fe1073a4a260193d7a19ccf1
# first bad commit: [925bb1ce47f429f69aad35876df7ecd8c53deb7e] tty: fix port buffer locking

[   56.727594] ======================================================
[   56.728482] WARNING: possible circular locking dependency detected
[   56.729318] 4.12.0-master #108 Tainted: G            E
[   56.730125] ------------------------------------------------------
[   56.731024] kworker/u8:3/329 is trying to acquire lock:
[   56.731774]  (&buf->lock){+.+...}, at: [<ffffffff81423e97>] tty_buffer_flush+0x37/0xa0
[   56.732773] but task is already holding lock:
[   56.733444]  (&o_tty->termios_rwsem/1){++++..}, at: [<ffffffff8141fb91>] isig+0x41/0xf0
[   56.734303] which lock already depends on the new lock.
[   56.734868] the existing dependency chain (in reverse order) is:
[   56.735377] -> #2 (&o_tty->termios_rwsem/1){++++..}:
[   56.735799]        lock_acquire+0xbd/0x200
[   56.736071]        down_read+0x4c/0x70
[   56.736323]        n_tty_write+0x90/0x450
[   56.736584]        tty_write+0x159/0x2e0
[   56.736844]        __vfs_write+0x28/0x140
[   56.737119]        vfs_write+0xc4/0x1c0
[   56.737376]        redirected_tty_write+0x5c/0xa0
[   56.737681]        __do_readv_writev+0x2b4/0x350
[   56.737980]        do_readv_writev+0x63/0x80
[   56.738263]        vfs_writev+0x3c/0x50
[   56.738513]        do_writev+0x55/0xd0
[   56.738758]        SyS_writev+0x10/0x20
[   56.739009]        entry_SYSCALL_64_fastpath+0x1f/0xbe
[   56.739345] -> #1 (&tty->atomic_write_lock){+.+.+.}:
[   56.739760]        lock_acquire+0xbd/0x200
[   56.740026]        __mutex_lock+0x75/0x8e0
[   56.740297]        mutex_lock_nested+0x1b/0x20
[   56.740585]        tty_port_default_receive_buf+0x48/0x90
[   56.740932]        flush_to_ldisc+0x7f/0xa0
[   56.741233]        process_one_work+0x1c5/0x5c0
[   56.741528]        worker_thread+0x4d/0x3c0
[   56.741800]        kthread+0x114/0x150
[   56.742046]        ret_from_fork+0x2a/0x40
[   56.742318] -> #0 (&buf->lock){+.+...}:
[   56.742664]        __lock_acquire+0x13e1/0x1400
[   56.742957]        lock_acquire+0xbd/0x200
[   56.743228]        __mutex_lock+0x75/0x8e0
[   56.743496]        mutex_lock_nested+0x1b/0x20
[   56.743784]        tty_buffer_flush+0x37/0xa0
[   56.744068]        pty_flush_buffer+0x27/0x80
[   56.744462]        tty_driver_flush_buffer+0x1b/0x20
[   56.744776]        isig+0x95/0xf0
[   56.744999]        n_tty_receive_signal_char+0x1c/0x70
[   56.745338]        n_tty_receive_char_special+0x863/0xb00
[   56.745660]        n_tty_receive_buf_common+0x7ae/0xac0
[   56.746013]        n_tty_receive_buf2+0x14/0x20
[   56.746311]        tty_ldisc_receive_buf+0x23/0x50
[   56.746598]        tty_port_default_receive_buf+0x59/0x90
[   56.746960]        flush_to_ldisc+0x7f/0xa0
[   56.747237]        process_one_work+0x1c5/0x5c0
[   56.747509]        worker_thread+0x4d/0x3c0
[   56.747761]        kthread+0x114/0x150
[   56.748016]        ret_from_fork+0x2a/0x40
[   56.748300] other info that might help us debug this:
[   56.748835] Chain exists of:
                 &buf->lock --> &tty->atomic_write_lock --> &o_tty->termios_rwsem/1
[   56.749581]  Possible unsafe locking scenario:
[   56.749994]        CPU0                    CPU1
[   56.750294]        ----                    ----
[   56.750564]   lock(&o_tty->termios_rwsem/1);
[   56.750855]                                lock(&tty->atomic_write_lock);
[   56.751314]                                lock(&o_tty->termios_rwsem/1);
[   56.751808]   lock(&buf->lock);
[   56.752021]  *** DEADLOCK ***
[   56.752380] 6 locks held by kworker/u8:3/329:
[   56.752641]  #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff8108861f>] process_one_work+0x13f/0x5c0
[   56.753203]  #1:  ((&buf->work)){+.+...}, at: [<ffffffff8108861f>] process_one_work+0x13f/0x5c0
[   56.753717]  #2:  (&port->buf.lock/1){+.+...}, at: [<ffffffff81423935>] flush_to_ldisc+0x25/0xa0
[   56.754243]  #3:  (&tty->ldisc_sem){++++.+}, at: [<ffffffff81422ccf>] tty_ldisc_ref+0x1f/0x60
[   56.754748]  #4:  (&tty->atomic_write_lock){+.+.+.}, at: [<ffffffff814242b8>] tty_port_default_receive_buf+0x48/0x90
[   56.755373]  #5:  (&o_tty->termios_rwsem/1){++++..}, at: [<ffffffff8141fb91>] isig+0x41/0xf0
[   56.755875] stack backtrace:
[   56.756139] CPU: 2 PID: 329 Comm: kworker/u8:3 Tainted: G            E   4.12.0-master #108
[   56.756637] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[   56.757327] Workqueue: events_unbound flush_to_ldisc
[   56.757629] Call Trace:
[   56.757830]  dump_stack+0x85/0xc9
[   56.758048]  print_circular_bug+0x1f9/0x207
[   56.758315]  __lock_acquire+0x13e1/0x1400
[   56.758564]  ? check_irq_usage+0x83/0xb0
[   56.758882]  lock_acquire+0xbd/0x200
[   56.759113]  ? tty_buffer_flush+0x37/0xa0
[   56.759463]  ? tty_buffer_flush+0x37/0xa0
[   56.759737]  __mutex_lock+0x75/0x8e0
[   56.759998]  ? tty_buffer_flush+0x37/0xa0
[   56.760250]  ? isig+0x5e/0xf0
[   56.760433]  ? tty_buffer_flush+0x37/0xa0
[   56.760677]  ? find_held_lock+0x35/0xa0
[   56.760957]  ? isig+0x8d/0xf0
[   56.761140]  mutex_lock_nested+0x1b/0x20
[   56.761383]  tty_buffer_flush+0x37/0xa0
[   56.761617]  pty_flush_buffer+0x27/0x80
[   56.761870]  tty_driver_flush_buffer+0x1b/0x20
[   56.762162]  isig+0x95/0xf0
[   56.762340]  n_tty_receive_signal_char+0x1c/0x70
[   56.762619]  n_tty_receive_char_special+0x863/0xb00
[   56.762960]  n_tty_receive_buf_common+0x7ae/0xac0
[   56.763250]  ? tty_ldisc_ref+0x1f/0x60
[   56.763478]  n_tty_receive_buf2+0x14/0x20
[   56.763721]  tty_ldisc_receive_buf+0x23/0x50
[   56.763983]  ? mutex_lock_nested+0x1b/0x20
[   56.764238]  tty_port_default_receive_buf+0x59/0x90
[   56.764533]  flush_to_ldisc+0x7f/0xa0
[   56.764756]  process_one_work+0x1c5/0x5c0
[   56.765075]  ? process_one_work+0x13f/0x5c0
[   56.765366]  worker_thread+0x4d/0x3c0
[   56.765590]  kthread+0x114/0x150
[   56.765787]  ? process_one_work+0x5c0/0x5c0
[   56.766094]  ? kthread_create_on_node+0x40/0x40
[   56.766374]  ret_from_fork+0x2a/0x40