Hi Bernd, On Fri, Jul 14, 2017 at 4:36 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote: > On 07/14/17 13:03, Georg-Johann Lay wrote: >> On 13.07.2017 18:47, Bernd Edlinger wrote: >>> On 07/13/17 16:31, Georg-Johann Lay wrote: >>>> On 12.07.2017 15:40, Bernd Edlinger wrote: >>>>> On 07/11/17 22:28, Bernd Edlinger wrote: >>>>>> On 07/11/17 21:42, Andrew Pinski wrote: >>>>>>> On Tue, Jul 11, 2017 at 12:31 PM, Andrew Pinski <pins...@gmail.com> >>>>>>> wrote: >>>>>>>> On Tue, Jul 11, 2017 at 12:27 PM, Andrew Pinski <pins...@gmail.com> >>>>>>>> wrote: >>>>>>>>> On Tue, Jul 11, 2017 at 12:15 PM, Bernd Edlinger >>>>>>>>> <bernd.edlin...@hotmail.de> wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I see this now as well on Ubuntu 16.04, but I doubt that the >>>>>>>>>> Kernel is >>>>>>>>>> to blame. >>>>>>>>> >>>>>>>>> I don't see these failures when I use a 4.11 kernel. Only with a >>>>>>>>> 4.4 kernel. >>>>>>>>> Also the guality testsuite does not run at all with a 4.4 >>>>>>>>> kernel, it >>>>>>>>> does run when using a 4.11 kernel; I suspect this is the same >>>>>>>>> symptom >>>>>>>>> of the bug. >>>>>>>> >>>>>>>> >>>>>>>> 4.11 kernel results (with CentOS 7.4 glibc): >>>>>>>> https://gcc.gnu.org/ml/gcc-testresults/2017-07/msg00998.html >>>>>>>> >>>>>>>> 4.4 kernel results (with ubuntu 1604 glibc): >>>>>>>> https://gcc.gnu.org/ml/gcc-testresults/2017-07/msg01009.html >>>>>>>> >>>>>>>> Note the glibc does not matter here as I get a similar testsuite >>>>>>>> results as the CentOS 7.4 glibc using the Ubuntu 1604 glibc but with >>>>>>>> the 4.11 kernel. >>>>>>>> >>>>>>>> Also note I get a similar results with a plain 4.11 kernel >>>>>>>> compared to >>>>>>>> the above kernel. >>>>>>> >>>>>>> >>>>>>> Maybe commit 6a7e6f78c235975cc14d4e141fa088afffe7062c or >>>>>>> 0f40fbbcc34e093255a2b2d70b6b0fb48c3f39aa to the kernel fixes both >>>>>>> issues. >>>>>>> >>>>>> >>>>>> As Georg-Johann points out here: >>>>>> https://gcc.gnu.org/ml/gcc/2017-07/msg00028.html >>>>>> updating the kernel alone does not fix the issue. >>>>>> >>>>>> These patches do all circle around pty and tty devices, >>>>>> but in the strace file I see a pipe(2) command >>>>>> creating the fileno 10 and the sequence of events is >>>>>> as follows: >>>>>> >>>>>> pipe([7, 10]) = 0 (not in my previous e-mail) >>>>>> clone() = 16141 >>>>>> clone() = 16142 >>>>>> write(4, "spawn: returns {0}\r\n", 20) = 20 >>>>>> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16141, >>>>>> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16142, >>>>>> read(10,...,4096) = 4096 >>>>>> read(10,...,4096) = 2144 >>>>>> read(10, "", 4096) = 0 >>>>>> close(10) = 0 >>>>>> wait4(16141, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 16141 >>>>>> wait4(16142, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 16142 >>>>>> >>>>>> All data arrive at the expect process, but apparently too quickly. >>>>>> >>>>>> >>>>>> Thanks >>>>>> Bernd. >>>>> >>>>> >>>>> >>>>> Oh, I think the true reason is this patch: >>>>> Author: Upstream >>>>> Description: Report and fix both by Nils Carlson. >>>>> Replaced a cc==0 check with proper Tcl_Eof() check. >>>>> Last-Modified: Fri, 23 Oct 2015 11:09:07 +0300 >>>>> Bug: http://sourceforge.net/p/expect/patches/18/ >>>>> Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799301 >>>>> >>>>> --- a/expect.c >>>>> +++ b/expect.c >>>>> @@ -1860,19 +1860,11 @@ >>>>> if (cc == EXP_DATA_NEW) { >>>>> /* try to read it */ >>>>> cc = expIRead(interp,esPtr,timeout,tcl_set_flags); >>>>> - >>>>> - /* the meaning of 0 from i_read means eof. Muck with it a */ >>>>> - /* little, so that from now on it means "no new data arrived */ >>>>> - /* but it should be looked at again anyway". */ >>>>> - if (cc == 0) { >>>>> + >>>>> + if (Tcl_Eof(esPtr->channel)) { >>>>> cc = EXP_EOF; >>>>> - } else if (cc > 0) { >>>>> - /* successfully read data */ >>>>> - } else { >>>>> - /* failed to read data - some sort of error was encountered >>>>> such as >>>>> - * an interrupt with that forced an error return >>>>> - */ >>>>> } >>>>> + >>>>> } else if (cc == EXP_DATA_OLD) { >>>>> cc = 0; >>>>> } else if (cc == EXP_RECONFIGURE) { >>>>> >>>>> >>>>> The correct way to fix this would be something like this: >>>>> >>>>> if (cc == 0 && Tcl_Eof(esPtr->channel)) { >>>>> cc = EXP_EOF; >>>>> } >>>>> >>>>> What happens for me is cc > 0 AND Tcl_Eof at the same time. >>>>> That makes dejagnu ignore the last few lines, because it does >>>>> not expect EOF and data at the same time. >>>>> >>>>> Apparently tcl starts to read ahead because the default match >>>>> buffer size is unreasonably small like 2000 bytes. >>>>> >>>>> I can work around it by increasing the buffer size like this: >>>>> >>>>> ndex: gcc/testsuite/lib/gcc-dg.exp >>>>> =================================================================== >>>>> --- gcc/testsuite/lib/gcc-dg.exp (revision 250150) >>>>> +++ gcc/testsuite/lib/gcc-dg.exp (working copy) >>>>> @@ -43,6 +43,9 @@ >>>>> setenv LANG C.ASCII >>>>> } >>>>> >>>>> +# Avoid sporadic data-losses with expect >>>>> +match_max -d 10000 >>>>> + >>>>> # Ensure GCC_COLORS is unset, for the rare testcases that verify >>>>> # how output is colorized. >>>>> if [info exists ::env(GCC_COLORS) ] { >>>>> >>>>> >>>>> What do you think? >>>>> Can debian/ubuntu people reproduce and fix this? >>>>> Should we increase the match buffer size on the gcc side? >>>>> >>>>> Thanks >>>>> Bernd. >>>> >>>> Sigh, I am still getting these spurious fails. >>>> >>>> $ sudo apt-get install tcl-expect >>>> Reading package lists... Done >>>> Building dependency tree >>>> Reading state information... Done >>>> tcl-expect is already the newest version (5.45-7). >>>> ... >>>> >>>> So I have "5.45-7" >>>> >>>> $ apt-get changelog tcl-expect >>>> expect (5.45-7) unstable; urgency=medium >>>> >>>> * Applied a few fixes by upstream which were included in Expect >>>> 5.45.3 >>>> never released as a tarball (closes: #799301). >>>> * Bumped standards version to 3.9.6. >>>> >>>> -- Sergei Golovan <sgolo...@debian.org> Fri, 23 Oct 2015 11:27:59 >>>> +0300 >>>> ... >>>> >>>> $ expect -v >>>> expect version 5.45 >>>> >>>> Not showing the bugfix level as of configure:PACKAGE_VERSION='5.45'. >>>> >>>> I also tried building "expect" from sources, but this is throwing so >>>> much warnings that I really don't want to install the outcome. Some >>>> warnings I can fix by #include "tclInt.h" in expect_cf.h. Other, less >>>> important warnings can be silenced by >>>> $ make CFLAGS_WARNING='-Wno-switch -Wno-unused-result >>>> -Wno-format-security ...' >>>> But still many warnings, casting pointers against int of different size, >>>> etc. All sources appear to be from a project abandoned decades ago... >>>> > > > Just for the records, here is how I bould the expect package out of the > souces for testing without actually modifying the installation: > > > apt-get source tcl8.6 > mkdir tcl-build > cd tcl-build > ../tcl8.6-8.6.5+dfsg/unix/configure --prefix=$(cd ..; echo > $PWD)/test-install > make > make install > make install-private-headers > cd .. > apt-get source expect > mkdir expect-build > cd expect-build > ../expect-5.45/configure --prefix=$(cd ..; echo $PWD)/test-install > make > make install > cd .. > PATH=$PWD/$1/bin:$PATH > export LD_LIBRARY_PATH=$PWD/$1/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH} > cd gcc-build > make check-gcc-c -k -j8 > > > And, as expected the bug goes away if I install the following patch: > > diff -Nur expect-5.45.orig/expect.c expect-5.45/expect.c > --- expect-5.45.orig/expect.c 2017-07-14 09:17:01.000000000 +0200 > +++ expect-5.45/expect.c 2017-07-14 08:57:56.595575429 +0200 > @@ -1859,7 +1859,7 @@ > /* try to read it */ > cc = expIRead(interp,esPtr,timeout,tcl_set_flags); > > - if (Tcl_Eof(esPtr->channel)) { > + if (cc == 0 && Tcl_Eof(esPtr->channel)) { > cc = EXP_EOF; > } > > > Adding debian package maintainer in CC: maybe consider above patch > for inclusion?
Thank you all for the analysis and proposed fix. I've uploaded the new package to Debian unstable (5.45-8), and opened a bugreport upstream (https://sourceforge.net/p/expect/patches/21/). > > Thanks > Bernd. > >>>> Anyway, as the installed version comes with #799301, there must be at >>>> least one other bug somewhere... >>>> >>>> Johann >>>> >>>> >>>> Just for completeness: >>>> >>>> $ lsb_release -a >>>> No LSB modules are available. >>>> Distributor ID: Ubuntu >>>> Description: Ubuntu 16.04.2 LTS >>>> Release: 16.04 >>>> Codename: xenial >>>> >>>> $uname -a >>>> Linux pandora 4.11.2-041102-generic #201705201036 SMP Sat May 20 >>>> 14:38:21 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux >>>> >>>> >>> >>> I am pretty sure that this bugfix should fix something with unicode >>> characters which we don't need because we have LC_ALL=C everywhere, >>> but as it looks like, it introduced a new more severe problem. >> >> Ah, thanks. I didn't read your previous post careful enough. My bad. >> >> IIUC #799301 /introduces/ new problems that might be relevant here. >> >>> The bug goes away if I use expect_5.45-5ubuntu1_amd64.deb from 14.04 >>> and extract it on 16.04 with the following command: >>> >>> dpkg-deb -x expect_5.45-5ubuntu1_amd64.deb test >>> PATH=$PWD/test/bin:$PATH >>> export LD_LIBRARY_PATH=$PWD/test/lib >>> >>> I cannot install the 16.04 expect on 14.04 because it depends on tcl8.6 >>> while 14.04 only has tcl8.5. >>> >>> But the bug also goes away if I used the following patch. >>> >>> So did you try this already: >>> >>> Index: gcc/testsuite/lib/gcc-dg.exp >>> =================================================================== >>> --- gcc/testsuite/lib/gcc-dg.exp (revision 250150) >>> +++ gcc/testsuite/lib/gcc-dg.exp (working copy) >>> @@ -43,6 +43,9 @@ >>> setenv LANG C.ASCII >>> } >>> >>> +# Avoid sporadic data-losses with expect >>> +match_max -d 10000 >>> + >>> # Ensure GCC_COLORS is unset, for the rare testcases that verify >>> # how output is colorized. >>> if [info exists ::env(GCC_COLORS) ] { >>> >>> Thanks >>> Bernd. >> >> Just tried it (with ubsan.exp only) and it appears to work. At least I >> didn't see any FAILs with "match_max -d 10000". >> >> Johann >> -- Sergei Golovan