RFC: Monitoring old PRs, new dg directives

Marek Polacek via Gcc-patches Tue, 28 Jul 2020 14:45:19 -0700

In Bugzilla, for the c++ component, we currently have over 3200 open bugs.  In
my experience, a good amount of them have already been fixed; my periodical
sweeps always turn up a bunch of PRs that had already been fixed previously.
Sometimes my sweeps are more or less random, but more often than not I'm just
looking for duplicates of an existing PR.  Sometimes the reason the already
fixed PRs are still open is because a PR that was fixed had duplicates that we
didn't catch earlier when confirming the PR.  Sometimes a PR gets fixed as a
side-effect of fixing another PR.  Manual sweeps are tedious and time-consuming
because often you need to grab the test from the Bugzilla yet again (and
sometimes there are multiple tests).  Even if you find a PR that was fixed, you
still need to bisect the fix and perhaps add the test to our testsuite.  That's
draining and since the number of bugs only increases, never decreases, it is not
sustainable.


So I've started a personal repo where I've gathered dozens of tests and wrote a
script that just compiles every test in the repo and reports if anything
changed.  One line from it:

pr=59798; $cxx $o -c $pr.C 2>&1 | grep -qE 'internal compiler error' || echo -e 
"$pr: ${msg_ice}"

This has major drawbacks: you have to remember to run this manually, keep
updating it, and it's yet another repo that people interested in this would
have to clone, but the worst thing is that typically you would only discover
that a patch fixed a different PR long after the patch was committed.  And
quite likely it wasn't even your patch.  We know that finding problems earlier
in the developer workflow reduces costs; if we can catch this before the
original developer commits & pushes the changes, it's cheaper, because the
developer already understands what the patch does.

A case in point: https://gcc.gnu.org/PR58156 which has been fixed recently
by an unrelated (?) patch.  Knowing that the tsubst_pack_expansion hunk in
the patch had this effect would probably have been very useful.  More testing
will lead to a better compiler.

Another case: https://gcc.gnu.org/35098 which was fixed 12 years (!) after
it was reported by a different change.

Or another: https://gcc.gnu.org/91525 where the patch contained a test, but
that was ice-on-invalid, whereas the test in PR91525 was ice-on-valid.

To alleviate some of these problems, I propose that we introduce a means to our
DejaGNU infrastructure that allows adding tests for old bugs that have not been
fixed yet, and re-introduce the keyword monitored (no longer used for anything
-- I think Volker stepped away) to the GCC Bugzilla to signal that a PR is
tracked in the testsuite.  I don't want any unnecessary moving tests around, so
the tests would go where they would normally go; they have to be reduced and
have proper targets, etc.  Having such tests in the testsuite means that when
something changes, you will know immediately, before you push any changes.

My thinking is that for:

* rejects-valid: use the existing dg-xfail-if
* accepts-valid: use the new dg-accepts-invalid
* ICEs: use the new dg-ice

dg-ice can be used like this:

// { dg-ice "build_over_call" { target c++11 } }

and it means that if the test still ICEs, you'll get a quiet XFAIL.  If the
ICE is fixed, you'll get an XPASS; if the ICE is gone but there are errors,
you'll get an XPASS + FAIL.  Then you can close the old PR.

Similarly, dg-accepts-invalid:

// { dg-accepts-invalid "PR86500" }

means that if the test still compiles without errors, you'll get a quiet XFAIL.
If we start giving errors, you'll get an XPASS.

If the bug is fixed, simply remove the directive.

The patch implementing these new directives is appended.  Once/if this is
accepted, I can start adding the old tests we have in our Bugzilla.  (I'm
only concerned about the c++ component, if that wasn't already clear.)

The question is what makes the bug "old": is it one year without it being
assigned?  6 months?  3 months?  Note: I *don't* propose to add every test for
every new PR, just the reasonably old ones that are useful/important.  Such
additions should be done in batches, so that we don't have dozens of commits,
each of them merely adding a single test.

We will still have a surfeit of bugs that we've given short shrift to, but
let's at least automate what we can.  The initial addition of the relevant
old(-ish) tests won't of course happen automagically, but it's a price I'm
willing to pay.  My goal here isn't merely to reduce the number of open PRs;
it is to improve the testing of the compiler overall.

Thoughts?

Bootstrapped/regtested on x86_64-pc-linux-gnu.

[PATCH] testsuite: Introduce dg-ice and dg-accepts-invalid.

gcc/ChangeLog:

        * doc/sourcebuild.texi: Document dg-ice and dg-accepts-invalid.

gcc/testsuite/ChangeLog:

        * lib/gcc-dg.exp (gcc-dg-test-1): Handle dg-ice and dg-accepts-invalid.
        * lib/prune.exp (prune_ices): New.
        * lib/target-supports-dg.exp (dg-accepts-invalid, dg-ice): New.
---
 gcc/doc/sourcebuild.texi                 | 19 +++++++
 gcc/testsuite/lib/gcc-dg.exp             | 39 +++++++++++++-
 gcc/testsuite/lib/prune.exp              |  9 ++++
 gcc/testsuite/lib/target-supports-dg.exp | 69 ++++++++++++++++++++++++
 4 files changed, 134 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index a7a922d84a2..636d21d30dd 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1172,6 +1172,16 @@ Expect the execute step of a test to fail if the 
conditions (which are
 the same as for @code{dg-skip-if}) are met.
 @end table
 
+@subsubsection Expect the compiler to crash
+
+@table @code
+@item  @{ dg-ice @var{comment} [@{ @var{selector} @} [@{ @var{include-opts} @} 
[@{ @var{exclude-opts} @}]]] @}
+Expect the compiler to crash with an internal compiler error and return
+a nonzero exit status if the conditions (which are the same as for
+@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been
+fixed yet.
+@end table
+
 @subsubsection Expect the test executable to fail
 
 @table @code
@@ -1234,6 +1244,15 @@ has the same effect as @samp{target}.
 
 @item @{ dg-prune-output @var{regexp} @}
 Prune messages matching @var{regexp} from the test output.
+
+@table @code
+@item  @{ dg-accepts-invalid @var{comment} [@{ @var{selector} @} [@{ 
@var{include-opts} @} [@{ @var{exclude-opts} @}]]] @}
+Expect the compiler to accept the test (even though it should be rejected with
+a compile-time error), if the conditions (which are the same as for
+@code{dg-skip-if}) are met.  Used for tests that test bugs that have not been
+fixed yet.
+@end table
+
 @end table
 
 @subsubsection Verify output of the test executable
diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index 45d97024883..6478eda283b 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -308,13 +308,48 @@ proc gcc-dg-test-1 { target_compile prog do_what 
extra_tool_flags } {
     verbose "$target_compile $prog $output_file $compile_type $options" 4
     set comp_output [$target_compile "$prog" "$output_file" "$compile_type" 
$options]
 
+    global expect_ice
     # Look for an internal compiler error, which sometimes masks the fact
     # that we didn't get an expected error message.  XFAIL an ICE via
     # dg-xfail-if and use { dg-prune-output ".*internal compiler error.*" }
     # to avoid a second failure for excess errors.
-    if [string match "*internal compiler error*" $comp_output] {
+    # "Error reporting routines re-entered" ICE says "Internal" rather than
+    # "internal", so match that too.
+    if [string match {*[Ii]nternal compiler error*} $comp_output] {
        upvar 2 name name
-       fail "$name (internal compiler error)"
+       if { $expect_ice == 0 } {
+         fail "$name (internal compiler error)"
+       } else {
+         # We expected an ICE and we got it.  Emit an XFAIL.
+         setup_xfail "*-*-*"
+         fail "$name (internal compiler error)"
+         clear_xfail "*-*-*"
+         # Prune the ICE from the output.
+         set comp_output [prune_ices $comp_output]
+       }
+    } else {
+       upvar 2 name name
+       global accepts_invalid
+       if { $expect_ice == 1 } {
+         # We expected an ICE but we didn't get it.  We want an XPASS, so
+         # call setup_xfail to set xfail_flag.
+         setup_xfail "*-*-*"
+         pass "$name (internal compiler error)"
+         clear_xfail "*-*-*"
+       } elseif { $accepts_invalid == 1 } {
+           if [string match {*error: *} $comp_output] {
+             # We expected that this test be (wrongly) accepted, but now we 
have
+             # seen error(s).  Issue an XPASS to signal that.
+             setup_xfail "*-*-*"
+             pass "$name (accepts invalid)"
+             clear_xfail "*-*-*"
+           } else {
+             # This test is still (wrongly) accepted.  Just emit an XFAIL.
+             setup_xfail "*-*-*"
+             fail "$name (accepts invalid)"
+             clear_xfail "*-*-*"
+           }
+       }
     }
 
     if { $do_what == "repo" } {
diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index 1c776249f1a..58a739684a5 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -118,6 +118,15 @@ proc prune_file_path { text } {
     return $text
 }
 
+# Prune internal compiler error messages, including the "Please submit..."
+# footnote.
+
+proc prune_ices { text } {
+  regsub -all "(^|\n)\[^\n\]*: internal compiler error:.*for 
instructions\[^\n\]*" $text "" text
+  regsub -all "(^|\n|')*Internal compiler error:.*for instructions\[^\n\]*" 
$text "" text
+  return $text
+}
+
 # Provide a definition of this if missing (delete after next dejagnu release).
 
 if { [info procs prune_warnings] == "" } then {
diff --git a/gcc/testsuite/lib/target-supports-dg.exp 
b/gcc/testsuite/lib/target-supports-dg.exp
index 2a21424b890..765f3a2e27a 100644
--- a/gcc/testsuite/lib/target-supports-dg.exp
+++ b/gcc/testsuite/lib/target-supports-dg.exp
@@ -495,6 +495,75 @@ proc dg-shouldfail { args } {
     }
 }
 
+# Record whether the compiler is expected (at the moment) to ICE.
+# Used for tests that test bugs that have not been fixed yet.
+
+set expect_ice 0
+set accepts_invalid 0
+
+proc dg-ice { args } {
+    # Don't bother if we're already skipping the test.
+    upvar dg-do-what dg-do-what
+    if { [lindex ${dg-do-what} 1] == "N" } {
+      return
+    }
+
+    global accepts_invalid
+    # Can't be combined with dg-accepts-invalid.
+    if { $accepts_invalid == 1 } {
+      error "dg-ice: cannot be combined with dg-accepts-invalid"
+      return
+    }
+
+    global expect_ice
+
+    set args [lreplace $args 0 0]
+    if { [llength $args] > 1 } {
+       set selector [list target [lindex $args 1]]
+       if { [dg-process-target-1 $selector] == "S" } {
+           # The target matches, now check the flags.
+           if [check-flags $args] {
+               set expect_ice 1
+           }
+       }
+    } else {
+       set expect_ice 1
+    }
+}
+
+# Record whether the compiler should reject the testcase with an error,
+# but currently doesn't do so.  Used for accepts-invalid bugs.
+
+proc dg-accepts-invalid { args } {
+    # Don't bother if we're already skipping the test.
+    upvar dg-do-what dg-do-what
+    if { [lindex ${dg-do-what} 1] == "N" } {
+      return
+    }
+
+    global expect_ice
+    # Can't be combined with dg-ice.
+    if { $expect_ice == 1 } {
+      error "dg-accepts-invalid: cannot be combined with dg-ice"
+      return
+    }
+
+    global accepts_invalid
+
+    set args [lreplace $args 0 0]
+    if { [llength $args] > 1 } {
+       set selector [list target [lindex $args 1]]
+       if { [dg-process-target-1 $selector] == "S" } {
+           # The target matches, now check the flags.
+           if [check-flags $args] {
+               set accepts_invalid 1
+           }
+       }
+    } else {
+       set accepts_invalid 1
+    }
+}
+
 # Intercept the call to the DejaGnu version of dg-process-target to
 # support use of an effective-target keyword in place of a list of
 # target triplets to xfail or skip a test.

base-commit: f3665bd1111c1799c0421490b5e655f977570354
-- 
2.26.2

RFC: Monitoring old PRs, new dg directives

Reply via email to