Re: Getting spurious FAILS in testsuite?

Georg-Johann Lay Thu, 13 Jul 2017 07:32:33 -0700

On 12.07.2017 15:40, Bernd Edlinger wrote:

On 07/11/17 22:28, Bernd Edlinger wrote:

On 07/11/17 21:42, Andrew Pinski wrote:

On Tue, Jul 11, 2017 at 12:31 PM, Andrew Pinski <pins...@gmail.com>
wrote:

On Tue, Jul 11, 2017 at 12:27 PM, Andrew Pinski <pins...@gmail.com>
wrote:

On Tue, Jul 11, 2017 at 12:15 PM, Bernd Edlinger
<bernd.edlin...@hotmail.de> wrote:

Hi,


I see this now as well on Ubuntu 16.04, but I doubt that the Kernel is
to blame.


I don't see these failures when I use a 4.11 kernel.  Only with a
4.4 kernel.
Also the guality testsuite does not run at all with a 4.4 kernel, it
does run when using a 4.11 kernel; I suspect this is the same symptom
of the bug.



4.11 kernel results (with CentOS 7.4 glibc):
https://gcc.gnu.org/ml/gcc-testresults/2017-07/msg00998.html

4.4 kernel results (with ubuntu 1604 glibc):
https://gcc.gnu.org/ml/gcc-testresults/2017-07/msg01009.html

Note the glibc does not matter here, as I get similar testsuite
results to the CentOS 7.4 glibc run when using the Ubuntu 1604 glibc
with the 4.11 kernel.

Also note I get similar results with a plain 4.11 kernel compared to
the above kernel.



Maybe commit 6a7e6f78c235975cc14d4e141fa088afffe7062c  or
0f40fbbcc34e093255a2b2d70b6b0fb48c3f39aa to the kernel fixes both
issues.


As Georg-Johann points out here:
https://gcc.gnu.org/ml/gcc/2017-07/msg00028.html
updating the kernel alone does not fix the issue.

These patches all revolve around pty and tty devices,
but in the strace file I see a pipe(2) call
creating file descriptor 10, and the sequence of events is
as follows:

pipe([7, 10]) = 0   (not in my previous e-mail)
clone() = 16141
clone() = 16142
write(4, "spawn: returns {0}\r\n", 20)  = 20
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16141,
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16142,
read(10,...,4096) = 4096
read(10,...,4096) = 2144
read(10, "", 4096) = 0
close(10) = 0
wait4(16141, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 16141
wait4(16142, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 16142

All data arrive at the expect process, but apparently too quickly.
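
A minimal sketch of the sequence in the strace above, in case it helps
to reproduce the kernel side in isolation: a child writes into a pipe
and exits, and the parent drains the pipe only afterwards. The helper
name drain_after_writer_exit is hypothetical, and the sketch assumes the
payload fits into the pipe buffer (6240 bytes does on Linux), because
the parent reaps the child before it starts reading.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of expect or dejagnu.
 * The child writes `total` bytes into a pipe and exits; the parent
 * reaps the child first and only then drains the pipe.
 * Returns the number of bytes read, or -1 on error.
 * Assumption: `total` fits into the pipe buffer, otherwise the child
 * would block in write() and the parent would wait forever. */
long drain_after_writer_exit(size_t total)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: write the payload in small chunks, then exit. */
        char chunk[512] = {0};
        size_t left = total;
        close(fds[0]);
        while (left > 0) {
            size_t want = left < sizeof chunk ? left : sizeof chunk;
            ssize_t n = write(fds[1], chunk, want);
            if (n <= 0)
                _exit(1);
            left -= (size_t)n;
        }
        _exit(0);
    }

    /* Parent: close the write end so that EOF becomes visible. */
    close(fds[1]);

    int status;
    if (waitpid(pid, &status, 0) != pid) {
        close(fds[0]);
        return -1;
    }

    /* The writer is gone, but its data is still queued in the pipe.
     * EOF shows up as a separate read() that returns 0. */
    long got = 0;
    char buf[4096];
    for (;;) {
        ssize_t n = read(fds[0], buf, sizeof buf);
        if (n < 0) {
            close(fds[0]);
            return -1;
        }
        if (n == 0)
            break;
        got += n;
    }
    close(fds[0]);
    return got;
}
```

Calling drain_after_writer_exit(6240) should return 6240, matching the
4096 + 2144 bytes in the trace. At the read(2) level nothing is lost,
which suggests looking at the layers above the kernel.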


Thanks
Bernd.




Oh, I think the true reason is this patch:
Author: Upstream
Description: Report and fix both by Nils Carlson.
   Replaced a cc==0 check with proper Tcl_Eof() check.
Last-Modified: Fri, 23 Oct 2015 11:09:07 +0300
Bug: http://sourceforge.net/p/expect/patches/18/
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=799301

--- a/expect.c
+++ b/expect.c
@@ -1860,19 +1860,11 @@
       if (cc == EXP_DATA_NEW) {
        /* try to read it */
        cc = expIRead(interp,esPtr,timeout,tcl_set_flags);
-       
-       /* the meaning of 0 from i_read means eof.  Muck with it a */
-       /* little, so that from now on it means "no new data arrived */
-       /* but it should be looked at again anyway". */
-       if (cc == 0) {
+
+       if (Tcl_Eof(esPtr->channel)) {
            cc = EXP_EOF;
-       } else if (cc > 0) {
-           /* successfully read data */
-       } else {
-           /* failed to read data - some sort of error was encountered such as
-            * an interrupt with that forced an error return
-            */
        }
+
       } else if (cc == EXP_DATA_OLD) {
        cc = 0;
       } else if (cc == EXP_RECONFIGURE) {


The correct way to fix this would be something like this:

      if (cc == 0 && Tcl_Eof(esPtr->channel)) {
          cc = EXP_EOF;
      }

What happens for me is cc > 0 AND Tcl_Eof at the same time.
That makes dejagnu ignore the last few lines, because it does
not expect EOF and data at the same time.
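
To make the difference concrete, here is a small stand-alone sketch of
the three policies. It does not use the real Expect or Tcl APIs: the
constant SKETCH_EOF and the classify_* functions are hypothetical
stand-ins, `cc` plays the role of the byte count returned by the read,
and `at_eof` plays the role of the Tcl_Eof() result.

```c
#include <stdbool.h>

/* Hypothetical stand-in for Expect's EXP_EOF; the real value is not
 * assumed here. */
enum { SKETCH_EOF = -1 };

/* Original code: only a zero-byte read is turned into EOF. */
int classify_original(int cc, bool at_eof)
{
    (void)at_eof;               /* the EOF flag is not consulted */
    return cc == 0 ? SKETCH_EOF : cc;
}

/* Patched code: the EOF flag wins, even if cc > 0 bytes were just
 * read, so those bytes are never looked at. */
int classify_patched(int cc, bool at_eof)
{
    return at_eof ? SKETCH_EOF : cc;
}

/* Proposed fix: report EOF only when no data came with it. */
int classify_proposed(int cc, bool at_eof)
{
    return (cc == 0 && at_eof) ? SKETCH_EOF : cc;
}
```

With cc = 2144 and at_eof = true, classify_patched() returns SKETCH_EOF
and the 2144 bytes are dropped, while classify_proposed() returns 2144
and leaves the EOF to the next call, where cc is 0.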

Apparently Tcl starts to read ahead because the default match
buffer size is unreasonably small, around 2000 bytes.

I can work around it by increasing the buffer size like this:

Index: gcc/testsuite/lib/gcc-dg.exp
===================================================================
--- gcc/testsuite/lib/gcc-dg.exp        (revision 250150)
+++ gcc/testsuite/lib/gcc-dg.exp        (working copy)
@@ -43,6 +43,9 @@
     setenv LANG C.ASCII
   }

+# Avoid sporadic data-losses with expect
+match_max -d 10000
+
   # Ensure GCC_COLORS is unset, for the rare testcases that verify
   # how output is colorized.
   if [info exists ::env(GCC_COLORS) ] {


What do you think?
Can debian/ubuntu people reproduce and fix this?
Should we increase the match buffer size on the gcc side?

Thanks
Bernd.


Sigh, I am still getting these spurious fails.

$ sudo apt-get install tcl-expect
Reading package lists... Done
Building dependency tree
Reading state information... Done
tcl-expect is already the newest version (5.45-7).
...

So I have "5.45-7"

$ apt-get changelog tcl-expect
expect (5.45-7) unstable; urgency=medium

  * Applied a few fixes by upstream which were included in Expect 5.45.3
    never released as a tarball (closes: #799301).
  * Bumped standards version to 3.9.6.

 -- Sergei Golovan <sgolo...@debian.org>  Fri, 23 Oct 2015 11:27:59 +0300
...

$ expect -v
expect version 5.45

This does not show the bugfix level, because configure sets
PACKAGE_VERSION='5.45'.

I also tried building "expect" from source, but this throws so many
warnings that I really don't want to install the outcome. Some
warnings I can fix with #include "tclInt.h" in expect_cf.h. Other, less
important warnings can be silenced by

$ make CFLAGS_WARNING='-Wno-switch -Wno-unused-result -Wno-format-security ...'

But there are still many warnings: casting pointers to ints of a
different size, etc. All sources appear to be from a project abandoned
decades ago...

Anyway, as the installed version comes with the fix for #799301, there
must be at least one other bug somewhere...


Johann


Just for completeness:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.2 LTS
Release:        16.04
Codename:       xenial

$ uname -a
Linux pandora 4.11.2-041102-generic #201705201036 SMP Sat May 20 14:38:21 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
