On 10/5/20 11:28 AM, David Malcolm via Gcc-patches wrote:
On Mon, 2020-10-05 at 11:51 +0200, Aldy Hernandez via Gcc-patches
wrote:
The walloca pass is a mess.  It has all sorts of heuristics to
divine
problematic ranges fed into alloca, none of them very good, and all
of
them unreadable.  The mess therein was actually one of the original
motivators for the ranger project (along with array bounds checking).

The attached patch is a conversion of the pass to ranger.  It's
mostly
an exercise in removing code.  The entire pass almost reduces to:

+  // If the user specified a limit, use it.
+  int_range_max r;
+  if (warn_limit_specified_p (is_vla)
+      && TREE_CODE (len) == SSA_NAME
+      && query.range_of_expr (r, len, stmt)
+      && !r.varying_p ())
+    {
+      // The invalid bits are anything outside of [0, MAX_SIZE].
+      static int_range<2> invalid_range (build_int_cst
(size_type_node, 0),
+                                        build_int_cst
(size_type_node,
+                                                       max_size),
+                                        VR_ANTI_RANGE);
+
+      r.intersect (invalid_range);
+      if (r.undefined_p ())
+       return alloca_type_and_limit (ALLOCA_OK);
+
+      return alloca_type_and_limit (ALLOCA_BOUND_MAYBE_LARGE,
+                                   wi::to_wide (integer_zero_node));
       }

That is, if the range of the integer passed to alloca is outside of
[0,MAX_SIZE], warn, otherwise it's ok.  Plain and simple.

You will notice I removed the nuanced errors we gave before-- like
trying to guess whether the problematic range came by virtue of a
signed
cast conversion.  These specific errors were never part of the
original
design, they were just stuff we could guess by how the IL
looked.  It
was non-exact and fragile.  Now we just say the alloca argument may
be
too large, period.

It the future, I would even like to remove the specific range the
ranger
was able to compute from the error message itself.  As will become
obvious, the ranger can get pretty outrageous ranges that are
entirely
non-obvious by looking at the code.  Peppering the error messages
with
these ranges will ultimately just confuse the user.  But alas, that's
a
problem for another patch to solve.
I can't comment on the content of the patch itself, but with my "user
experience hat" on I'm wondering what diagnostic messages involving
ranges ought to look like.  I worry that if we simply drop range
information altogether from the messages, we're not giving the user
enough information to make a judgment call on whether to pay attention
to the diagnostic.  That said, I'm unhappy with the status quo of our
range-based messages, so I don't object to the patch.

I also dont know what the right answer is regarding dumping of ranges.  Most of the time I suspect the error only needs the end points, not the fun stuff in the middle if its a multi-range... but I suppose thats up to the actual issuer of the message to decide if thats important.

Some possible ideas:

I added support for capturing and printing control-flow paths for
diagnostics in gcc 10; see gcc/diagnostic-path.h.  I added this for the
analyzer code, but it's usable outside of it.  It's possible to use
this to associate a chain of events with a diagnostic, by building a
diagnostic_path and associating it with a rich_location.  Perhaps the
ranger code could have a mode which captures diagnostic_paths, where
the events in the path give pertinent information about ranges - e.g.
the various conditionals that lead to a particular range.  There's a
pre-canned simple_diagnostic_path that can be populated with
simple_diagnostic_event, or you could create your own ranger-specific
event and path subclasses.  These can be temporary objects that live on
the stack next to the rich_location, just for the lifetime of emitting
the diagnostic.  Perhaps a "class ranger_rich_location" which
implicitly supplies an instance of "class ranger_diagnostic_path" to
the diagnostic, describing where the range information comes from?
For debugging I have a trace, but it doesnt log the trace. just sort of logs as it does the walk backwards to determine things.  I use it to figure out where things have gone amok.

  THis sort of thing would probably be best tackled by simply inheriting from a ranger and overloading the core routines to log them.. thats effectively what the tracer does. just emits info before and after each of the core 5 calls.  I suppose one could just as easily log the info that comes back, and with a little bit of extra understanding, determine whether this is a "new" value or a previously computed one and automatically do what I am doing manually when determining the origination of something.


Here are some more out-there ideas I had some years ago (I can't
remember if these ever made it to the mailing list).  These ideas focus
more on how the ranges are used, rather than where the ranges come
from.  There's an interaction, though, so I think it's on-topic - can
the ranger code supply this kind of information?


what kind of information are we talking about?  if you are talking about where the ranges came from, im sure one could collect that together in a derived class.  The ranger itself is really just cobbling together the various sources of information that are calculated by its components.  Its really just an organizer and bookkeeper. You could probably cobble together whatever bits of information you wanted to stash them together.

Andrew

Reply via email to