On 07/11/17 22:28, Bernd Edlinger wrote:
> On 07/11/17 21:42, Andrew Pinski wrote:
>> On Tue, Jul 11, 2017 at 12:31 PM, Andrew Pinski <pins...@gmail.com>
>> wrote:
>>> On Tue, Jul 11, 2017 at 12:27 PM, Andrew Pinski <pins...@gmail.com>
>>> wrote:
>>>> On Tue, Jul 11, 2017 at 12:15 PM, Bernd Edlinger
>>>> <bernd.edlin...@hotmail.de> wrote:
>>>>> Hi,
>>>>>
>>>>> I see this now as well on Ubuntu 16.04, but I doubt that the Kernel is
>>>>> to blame.
>>>>
>>>> I don't see these failures when I use a 4.11 kernel.  Only with a
>>>> 4.4 kernel.
>>>> Also the guality testsuite does not run at all with a 4.4 kernel, it
>>>> does run when using a 4.11 kernel; I suspect this is the same symptom
>>>> of the bug.
>>>
>>>
>>> 4.11 kernel results (with CentOS 7.4 glibc):
>>> https://gcc.gnu.org/ml/gcc-testresults/2017-07/msg00998.html
>>>
>>> 4.4 kernel results (with ubuntu 1604 glibc):
>>> https://gcc.gnu.org/ml/gcc-testresults/2017-07/msg01009.html
>>>
>>> Note the glibc does not matter here as I get a similar testsuite
>>> results as the CentOS 7.4 glibc using the Ubuntu 1604 glibc but with
>>> the 4.11 kernel.
>>>
>>> Also note I get a similar results with a plain 4.11 kernel compared to
>>> the above kernel.
>>
>>
>> Maybe commit 6a7e6f78c235975cc14d4e141fa088afffe7062c or
>> 0f40fbbcc34e093255a2b2d70b6b0fb48c3f39aa to the kernel fixes both
>> issues.
>>
>
> As Georg-Johann points out here:
> https://gcc.gnu.org/ml/gcc/2017-07/msg00028.html
> updating the kernel alone does not fix the issue.
>
> These patches do all circle around pty and tty devices,
> but in the strace file I see a pipe(2) command
> creating the fileno 10 and the sequence of events is
> as follows:
>
> pipe([7, 10]) = 0 (not in my previous e-mail)
> clone() = 16141
> clone() = 16142
> write(4, "spawn: returns {0}\r\n", 20) = 20
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16141,
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16142,
> read(10,...,4096) = 4096
> read(10,...,4096) = 2144
> read(10, "", 4096) = 0
> close(10) = 0
> wait4(16141, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 16141
> wait4(16142, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 16142
>
> All data arrive at the expect process, but apparently too quickly.
>
>
> Thanks
> Bernd.

Oh, I think the true reason is this patch:

Author: Upstream
Description: Report and fix both by Nils Carlson.
 Replaced a cc==0 check with proper Tcl_Eof() check.
Last-Modified: Fri, 23 Oct 2015 11:09:07 +0300
Bug: http://sourceforge.net/p/expect/patches/18/
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799301

--- a/expect.c
+++ b/expect.c
@@ -1860,19 +1860,11 @@
             if (cc == EXP_DATA_NEW) {
                 /* try to read it */
                 cc = expIRead(interp,esPtr,timeout,tcl_set_flags);
-
-                /* the meaning of 0 from i_read means eof.  Muck with it a */
-                /* little, so that from now on it means "no new data arrived */
-                /* but it should be looked at again anyway". */
-                if (cc == 0) {
+
+                if (Tcl_Eof(esPtr->channel)) {
                     cc = EXP_EOF;
-                } else if (cc > 0) {
-                    /* successfully read data */
-                } else {
-                    /* failed to read data - some sort of error was encountered such as
-                     * an interrupt with that forced an error return
-                     */
                 }
+
             } else if (cc == EXP_DATA_OLD) {
                 cc = 0;
             } else if (cc == EXP_RECONFIGURE) {

The correct way to fix this would be something like this:

    if (cc == 0 && Tcl_Eof(esPtr->channel)) {
        cc = EXP_EOF;
    }

What happens for me is cc > 0 AND Tcl_Eof at the same time.
That makes dejagnu ignore the last few lines, because it does
not expect EOF and data at the same time.

Apparently Tcl starts to read ahead because the default match
buffer size is unreasonably small, around 2000 bytes.

I can work around it by increasing the buffer size like this:

Index: gcc/testsuite/lib/gcc-dg.exp
===================================================================
--- gcc/testsuite/lib/gcc-dg.exp	(revision 250150)
+++ gcc/testsuite/lib/gcc-dg.exp	(working copy)
@@ -43,6 +43,9 @@
     setenv LANG C.ASCII
 }
 
+# Avoid sporadic data-losses with expect
+match_max -d 10000
+
 # Ensure GCC_COLORS is unset, for the rare testcases that verify
 # how output is colorized.
 if [info exists ::env(GCC_COLORS) ] {


What do you think?
Can Debian/Ubuntu people reproduce and fix this?
Should we increase the match buffer size on the gcc side?


Thanks
Bernd.
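
The "cc > 0 AND Tcl_Eof at the same time" situation can be reproduced
outside of expect. What follows is only a minimal sketch in plain C with
stdio, not Tcl or expect code: SKETCH_EOF, struct chunk, read_last_chunk()
and the two status_*() helpers are hypothetical names made up for this
illustration, and fread()/feof() merely stand in for expIRead() and
Tcl_Eof(). It assumes a POSIX system. The writer puts a short message into
a pipe and closes it; a single buffered read that asks for more than is
available then returns the data and leaves the EOF flag set.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for expect's EXP_EOF; not the real value. */
enum { SKETCH_EOF = -1 };

/* Result of one buffered read: byte count plus the stream's EOF flag. */
struct chunk {
    int cc;      /* bytes delivered by the read, or -1 if setup failed */
    int at_eof;  /* 1 if the EOF flag was set right after the read */
};

/* Check modelled on the patched expect.c: the EOF flag alone decides. */
int status_eof_flag_only(struct chunk c)
{
    return c.at_eof ? SKETCH_EOF : c.cc;
}

/* Check modelled on the suggested fix: EOF only when nothing was read. */
int status_eof_when_empty(struct chunk c)
{
    return (c.cc == 0 && c.at_eof) ? SKETCH_EOF : c.cc;
}

/* Write len bytes into a pipe, close the write end, then issue a single
 * fread() that asks for more bytes than are available. */
struct chunk read_last_chunk(size_t len)
{
    struct chunk result = { -1, 0 };
    char out[1024];
    char in[4096];
    int fds[2];
    FILE *fp;

    assert(len <= sizeof out);
    memset(out, 'x', len);

    if (pipe(fds) != 0)
        return result;
    if (write(fds[1], out, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return result;
    }
    close(fds[1]);  /* writer is gone: EOF follows directly after the data */

    fp = fdopen(fds[0], "r");
    if (fp == NULL) {
        close(fds[0]);
        return result;
    }
    /* A short count from fread() means it ran into EOF (or an error). */
    result.cc = (int)fread(in, 1, sizeof in, fp);
    result.at_eof = feof(fp) != 0;
    fclose(fp);
    return result;
}
```

Passing the result of read_last_chunk(100) to status_eof_flag_only()
reports EOF and the 100 bytes are never looked at, whereas
status_eof_when_empty() hands the count back and only reports EOF for an
empty read. Whether Tcl channels reach that state in exactly the same way
is an assumption here; the sketch only shows why the two checks differ.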