On 9/9/20 7:03 PM, Martin Sebor wrote:
On 9/3/20 1:14 AM, Aldy Hernandez via Gcc wrote:
Below is a documented we have drafted to provide guidance on using irange's and converting passes to it.  This will future proof any such passes so that they will work with the ranger, or any other mechanism using multiple sub-ranges (as opposed to the more limited value_range).

The official document will live here, but is included below for discussion:

     https://gcc.gnu.org/wiki/irange-best-practices

Feel free to respond to this post with any questions or comments.

Thanks for the writeup!

The biggest question on my mind at the moment is probably about
how to go about rewriting classes that maintain their own ranges
(most often as an array of two offset_int) to use one of the Ranger
classes.  Two examples are the access_ref class in builtins.h and
the builtin_memref class in gimple-ssa-warn-restrict.c.

Both of these compute the size of an object (as a simple range)
and the offset into it (also as a range).  In the future, they will
track of sizes of multiple objects (from PHI nodes).

My thinking is to do this in two steps: In 1) replace each
offset_int[2] member array with an instance of int_range<1> and
then rewrite all the code that manipulates the array with calls
to the ranger API. That should be fairly straightforward. In 2)

This sounds like a sound approach. Instead of passing pairs of integers, pass an irange and limit yourself to the non-deprecated part of the API.

replace the simple int_range<1> with something more interesting
(int_range_max?) and rewrite the final code that processes

The ranger will use whatever sized range you pass it. So if you want fine granularity, pass it an int_range_max.

However, if your final destination in memory is constrained (you're storing a gazillion of these), you can use the irange_pool to allocate the minimum amount of storage needed. Note, that the irange_pool has not been contributed yet. It'll come with the ranger.

Taking the strlen code as an example, you could transform the following:

          vr_values *v = CONST_CAST (vr_values *, rvals);
          const value_range_equiv *vr = v->get_value_range (si->nonzero_chars);
          if (vr->kind () != VR_RANGE || !range_int_cst_p (vr))
            return false;

...into:

int_range_max r;
if (!ranger->range_of_expr (r, si->nonzero_chars, si->stmt))
  return false;

Where ranger is declared somewhere at the top of your pass. Perhaps in printf_strlen_execute() and pass it around like you do rvals.

Note that for better precision, you should pass the gimple context. That is, the statement from which si->nonzero_chars came from. In the above case, I think you want si->stmt. Currently get_value_range() has a STMT parameter which is unused. This was the reason for providing such parameter.

Finally... you can store R into an int_range<XX> depending on how much space you want to use in your ultimate storage location. The assignment operator will squish R into whatever space you give it. So the following will truncate things appropriately:

int_range<3> final_location = r;

However, if you're using irange_pool to allocate just the amount of space needed, you could do:

// Presumably declared wherever you declared the ranger.
irange_pool pool;
...
...
irange *p = pool.allocate (r);

p will hold a pointer to a range that contains R with no extra space. This is ideal for longer term storage. For intermediate results, use int_range_max.

Note that I am already converting sprintf/strlen to use the ranger instead of evrp/vr_values (aldyh/ranger-staging git branch). So the work of converting to range_of_expr above, and passing around gimple context, will already be done with the contribution of ranger. What I will not provide for now, is the wholesale conversion of other passes to the "clean" irange API. So if your pass is still using kind() and stashing away pairs of offset_int's, that will have to be done separately. I can offer guidance though, and help as time permits.

the results to scan all the subranges for overflow or overlap,
as well as the code that presents them in warnings/notes.  It
would be nice to have support for the formatting of ranges in
the pretty-printer to cut down on the repetitive tests that
determine how to format a constant (%E, vs a range [%E, %E],
vs a multi-range [%E, %E][%E, %E], ...[%E, %E]).

I'm not a big fan of including ranges in warnings/errors. The ranger and the vrp twins, are bound to change over releases with finer and finer ranges. Having our error messages depend on ranges that may or may not change can be confusing for users, and it means we must change regression tests to match a changing compiler at each release.

I would much rather see a generic warning of "P may be out of bounds" than "P may be [X,Y][Z,A][C,D]". That's just confusing :). Also, note that the ranger can come up with ranges that are completely non-obvious, as they pass through PHIs, back edges, and a combination of operations. This will only add to the user confusion.


I suppose I/we should write this up as a case study after I/we
do the first rewrite.

Longer term, the size of an object (or objects) and an offset
into it/them seems like a generally useful property that would
be best associated with a pointer somehow, so that it could be
queried by an API similar to the one exposed by the irange class.
I imagine it would benefit both warnings and optimizations.

I do have one concern with a wholesale switch over to the ranger
classes and APIs.  Being designed with conservative assumptions
suitable for optimizers, the current APIs aren't necessarily
ideal for warnings,  A basic example is integer overflow and
wrapping.  In the general case, the optimizer must assume that
the range of the argument in 'malloc (n + 1)' might overflow (or
wrap).  A warning, though, should detect when it does and complain.
So I'd like to see a mode where the ranger evaluates expressions
in infinite precision.

The ranger will evaluate an expression in the mode of the trees there-in, nothing more, nothing less. If your range has an underlying type that wraps, the result will also wrap.

If you want to perform an operation ignoring such constraints, you probably want to cast to a larger type before doing the operation:

range_cast (range_n, SOME_LARGER_INT_TYPE);
range_cast (range_1, SOME_LARGER_INT_TYPE);

int_range_max result;
op.plus.fold_range (result, SOME_LARGER_INT_TYPE, range_n, range_1);

But in the case of malloc(n+1) above, the argument sits in an SSA, and the range plus operation was done in the underlying type. There's no way of changing that. The ranger won't lie to you. If the operation may wrap or overflow, it will show up as so in the range. I don't think we have any plans of changing this behavior.

Was that your question?

Aldy

Reply via email to