(cc'ing Ilya Zykov <i...@ilyx.ru> because the test jig below is based on his test program from https://lkml.org/lkml/2012/11/29/368 -- just want to give credit where credit is due)
On Fri, 2012-11-30 at 18:52 -0500, Sasha Levin wrote: > > Still reproducible, I'm still seeing this with the patch above applied: > > [ 1315.419759] ------------[ cut here ]------------ > [ 1315.420611] WARNING: at drivers/tty/tty_buffer.c:476 > flush_to_ldisc+0x60/0x200() > [ 1315.423098] tty is NULL Thanks for sticking with this Sasha. Finally me too. --- [ 88.331234] WARNING: at /home/peter/src/kernels/next/drivers/tty/tty_buffer.c:435 flush_to_ldisc+0x194/0x1d0() [ 88.334505] Hardware name: Bochs [ 88.335618] tty is bad=-1 [ 88.335703] Modules linked in: netconsole configfs bnep rfcomm bluetooth snd_hda_intel snd_hda_codec snd_hwdep parport_pc ppdev snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device mac_hid psmouse snd i2c_piix4 soundcore snd_page_alloc microcode serio_raw virtio_balloon lp parport floppy 8139too 8139cp [ 88.345272] Pid: 39, comm: kworker/1:1 Tainted: G W 3.7.0-next-20121129+ttydebug-xeon #20121129+ttydebug [ 88.347736] Call Trace: [ 88.349024] [<ffffffff81058aff>] warn_slowpath_common+0x7f/0xc0 [ 88.350383] [<ffffffff81058bf6>] warn_slowpath_fmt+0x46/0x50 [ 88.351745] [<ffffffff81432bd4>] flush_to_ldisc+0x194/0x1d0 [ 88.353047] [<ffffffff816f7fe1>] ? _raw_spin_unlock_irq+0x21/0x50 [ 88.354190] [<ffffffff8108a809>] ? finish_task_switch+0x49/0xe0 [ 88.355436] [<ffffffff81077ad1>] process_one_work+0x121/0x490 [ 88.357674] [<ffffffff81432a40>] ? __tty_buffer_flush+0x90/0x90 [ 88.358954] [<ffffffff81078c84>] worker_thread+0x164/0x3e0 [ 88.360247] [<ffffffff81078b20>] ? manage_workers+0x120/0x120 [ 88.361282] [<ffffffff8107e230>] kthread+0xc0/0xd0 [ 88.362284] [<ffffffff816f0000>] ? cmos_do_probe+0x2eb/0x3bf [ 88.363391] [<ffffffff8107e170>] ? flush_kthread_worker+0xb0/0xb0 [ 88.364797] [<ffffffff816fff6c>] ret_from_fork+0x7c/0xb0 [ 88.366087] [<ffffffff8107e170>] ? flush_kthread_worker+0xb0/0xb0 [ 88.367266] ---[ end trace 453a7c9f38fbfec0 ]--- I figured out how to make this reproduce easily. The test jig at the end of this email will generate this multiple times a second. The test creates a pty pair and spawns a child which writes to the slave pts, while the parent waits for the first write and then abruptly closes the master ptm and kills the child. (Just in case, I'd only run the jig in a disposable vm. Obviously, the vm needs multiple cores and extra pty serial devices ;) >From instrumenting the tty_release() path, it's clear that tty_buffer work is still scheduled even after tty_release_ldisc() has run. For example, with this patch I get the warning below it. [Further analysis to follow in subsequent mail...] --- >% --- [PATCH -next] tty: WARN if buffer work racing with tty free Signed-off-by: Peter Hurley <pe...@hurleysoftware.com> --- drivers/tty/tty_io.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index 1ce50ec..9d53aec 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1511,6 +1511,8 @@ static void queue_release_one_tty(struct kref *kref) { struct tty_struct *tty = container_of(kref, struct tty_struct, kref); + WARN_ON(work_pending(&tty->port->buf.work)); + /* The hangup queue is now free so we can reuse it rather than waste a chunk of memory for each port */ INIT_WORK(&tty->hangup_work, release_one_tty); -- 1.8.0 [ 88.376473] ------------[ cut here ]------------ [ 88.377513] WARNING: at /home/peter/src/kernels/next/drivers/tty/tty_io.c:1514 queue_release_one_tty+0x6e/0x70() [ 88.379566] Hardware name: Bochs [ 88.380665] Modules linked in: netconsole configfs bnep rfcomm bluetooth snd_hda_intel snd_hda_codec snd_hwdep parport_pc ppdev snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device mac_hid psmouse snd i2c_piix4 soundcore snd_page_alloc microcode serio_raw virtio_balloon lp parport floppy 8139too 8139cp [ 88.387282] Pid: 1834, comm: pty_kill Tainted: G W 3.7.0-next-20121129+ttydebug-xeon #20121129+ttydebug [ 88.389696] Call Trace: [ 88.390883] [<ffffffff81058aff>] warn_slowpath_common+0x7f/0xc0 [ 88.392122] [<ffffffff81058b5a>] warn_slowpath_null+0x1a/0x20 [ 88.393294] [<ffffffff814283de>] queue_release_one_tty+0x6e/0x70 [ 88.394486] [<ffffffff81428408>] tty_kref_put+0x28/0x30 [ 88.395644] [<ffffffff8142848c>] release_tty+0x7c/0xe0 [ 88.396814] [<ffffffff8142b76b>] tty_release+0x46b/0x4d0 [ 88.397904] [<ffffffff81194f2e>] __fput+0xae/0x230 [ 88.398969] [<ffffffff811950be>] ____fput+0xe/0x10 [ 88.399994] [<ffffffff8107aa38>] task_work_run+0xc8/0xf0 [ 88.401060] [<ffffffff8105ea06>] do_exit+0x196/0x4b0 [ 88.402059] [<ffffffff8105eee4>] do_group_exit+0x44/0xa0 [ 88.403063] [<ffffffff8106f3ed>] get_signal_to_deliver+0x20d/0x4e0 [ 88.404120] [<ffffffff81016979>] do_signal+0x29/0x130 [ 88.405116] [<ffffffff81431a2e>] ? tty_ldisc_deref+0xe/0x10 [ 88.406116] [<ffffffff8142a2e7>] ? tty_write+0xb7/0xf0 [ 88.407116] [<ffffffff81193dc3>] ? vfs_write+0xb3/0x180 [ 88.408143] [<ffffffff81016b20>] do_notify_resume+0x80/0xc0 [ 88.409118] [<ffffffff816f863c>] retint_signal+0x48/0x8c [ 88.410092] ---[ end trace 453a7c9f38fbfec1 ]--- --- >% --- /* * pty_thrash.c * * Based on original test jig by Ilya Zykov <i...@ilyx.ru> */ #include <stdio.h> #include <fcntl.h> #include <sys/ioctl.h> #include <termios.h> #include <stdlib.h> #include <errno.h> #include <string.h> #include <stdarg.h> #include <signal.h> #define parent child_id static int fd; static void error_exit(char *f, ...) { va_list va; va_start(va, f); vprintf(f, va); printf(": %s\n", strerror(errno)); va_end(va); if (fd >= 0) close(fd); exit(EXIT_FAILURE); } int main(int argc, char *argv[]) { int parent; char pts_name[24]; int ptn, unlock; while (1) { fd = open("/dev/ptmx", O_RDWR); if (fd < 0) error_exit("opening pty master"); unlock = 0; if (ioctl(fd, TIOCSPTLCK, &unlock) < 0) error_exit("unlocking pty pair"); if (ioctl(fd, TIOCGPTN, &ptn) < 0) error_exit("getting pty #"); snprintf(pts_name, sizeof(pts_name), "/dev/pts/%d", ptn); child_id = fork(); if (child_id == -1) error_exit("forking child"); if (parent) { int err, id, status; char buf[128]; int n; n = read(fd, buf, sizeof(buf)); if (n < 0) error_exit("master reading"); printf("%.*s\n", n-1, buf); close(fd); err = kill(child_id, SIGKILL); if (err < 0) error_exit("killing child"); id = waitpid(child_id, &status, 0); if (id < 0 || id != child_id) error_exit("waiting for child"); } else { /* Child */ close(fd); printf("Test cycle on slave pty %s\n", pts_name); fd = open(pts_name, O_RDWR); if (fd < 0) error_exit("opening pty slave"); while (1) { char pattern[] = "test\n"; if (write(fd, pattern, strlen(pattern)) < 0) error_exit("slave writing"); } } } /* never gets here */ return 0; } -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/