On Sat, Oct 08, 2011 at 02:45:27PM +0200, Axel Beckert wrote:
> Hi,
>
> I'm currently preparing an upload of the current screen HEAD to Debian
> (either Unstable or Experimental, not yet decided) and I noticed that,
> if I have a screen 4.0.3 (as currently in Debian Stable/Unstable) is
> running, I can't reattach to that with the new screen 4.1.0 snapshot,
> it just hangs until I kill it with "kill -TERM" as Ctrl-C does not
> help.
>
> As for me the common way to dist-upgrade Debian or Ubuntu boxes
> (especially remote servers) is to run the whole process inside a
> screen, this is a quite critical issue. So I wonder:
>
> Is this issue known? Not circumventable? An unexpected bug? Anyone has
> an idea where this comes from? Or does it not happen at all with vanilla
> screen versions?

I noticed the message while upgrading screen version sid → experimental,
thanks for making it loud.

> I tried to strace "screen -r" to find any differences, but
> interestingly it behaves differently when being traced (bails out with
> claiming /var/run/screen has wrong permissions or so) than when not.

This is because screen is setgid utmp:

    % ls -l =screen
    -rwxr-sr-x 1 root utmp 414432 Oct 9 05:02 /usr/bin/screen*

strace() will defuse setuid/setgid so it won't work as expected, but we
can always try as root:

    # aptitude install screen=4.0.3-14
    # screen -dmS test
    # strace -o /tmp/log_4.0.3 screen -r test
    [... works, I detach ...]
    # aptitude install screen=4.1.0~20110819git450e8f3-1
    # strace -o /tmp/log_4.1.0 screen -r test
    [... hangs, gets killed ...]

The difference I see at the end is:

    4.0.3 → write(4, "\0gsm\2\0\0\0/dev/pts/46\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 12336) = 12336
    4.1.0 → write(4, "\2gsm\2\0\0\0/dev/pts/46\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 12376) = 12376

The first byte was \0, now it is \2.

Is tracing the background process perhaps more interesting?

    # aptitude install screen=4.0.3-14
    # strace -t -o /tmp/server_log_4.0.3 -p 26322
    [... works, I detach, then ^C this strace ...]
    # aptitude install screen=4.1.0~20110819git450e8f3-1
    # strace -t -o /tmp/server_log_4.1.0 -p 26322
    [... hangs, I kill client, then ^C this strace ...]

Differences here:

4.0.3:
    0.000083 read(3, "\0gsm\2\0\0\0/dev/pts/46\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 12336) = 12336
    0.000125 close(3) = 0
    0.000075 geteuid() = 0
    0.000067 getegid() = 43
    0.000067 getgid() = 0
    0.000068 setregid(43, 0) = 0
    0.000077 open("/var/run/screen/S-root/26322.test", O_RDONLY|O_NONBLOCK) = 3
    0.000106 geteuid() = 0
    0.000068 getegid() = 0
    0.000067 getgid() = 43
    0.000069 setregid(0, 43) = 0
    0.000074 kill(26948, SIG_0) = 0
    0.000074 geteuid() = 0
    0.000067 getegid() = 43
    0.000067 getgid() = 0
    0.000068 setregid(43, 0) = 0
    0.000072 open("/dev/pts/46", O_RDWR|O_NONBLOCK) = 5
    [... goes on and on ...]

4.1.0:
    0.000087 read(3, "\2gsm\2\0\0\0/dev/pts/46\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 12336) = 12336
    0.000123 close(3) = 0
    0.000079 geteuid() = 0
    0.000073 getegid() = 43
    0.000072 getgid() = 0
    0.000074 setregid(43, 0) = 0
    0.000081 open("/var/run/screen/S-root/26322.test", O_RDONLY|O_NONBLOCK) = 3
    0.000112 geteuid() = 0
    0.000073 getegid() = 0
    0.000072 getgid() = 43
    0.000074 setregid(0, 43) = 0
    0.000089 select(1024, [3 4], [], NULL, NULL) = ? ERESTARTNOHAND (To be restarted)
    5.869203 --- SIGINT (Interrupt) @ 0 (0) ---

So yeah, the background screen is waiting for more traffic from the new
version, which it doesn't get.

The screen.git repo has no tags. The experimental package has a commit
SHA1 in its version string, which helps. How do I find out exactly
which commit corresponds to "Screen version 4.00.03jw4 (FAU) 2-May-06" ?

I'd like to bisect the issue, and knowing where to start would be a good
start :)

-- 
Fernando Vezzosi
perl -E'$_="Pop Corn",s/\b\S(?{$0=$&^_^L})/$0/g,say'
qw(MDAx MTAw MDEw MDEx MDAw _5 MTEw _6 _1 _5 _5 _4 _2 _2 _6 MTEx _1 _5 _5 _4 _5 _2 _6 _1 _1 _2 _2 _3 _5 _5 _6 _1 _1 _2 _5 _4 _3 _5 _2 _5 _1 _2 _3 _4 _5 MA==)