Re: irange best practices document

Aldy Hernandez via Gcc Thu, 10 Sep 2020 01:12:26 -0700

On 9/9/20 7:03 PM, Martin Sebor wrote:

On 9/3/20 1:14 AM, Aldy Hernandez via Gcc wrote:

Below is a documented we have drafted to provide guidance on usingirange's and converting passes to it. This will future proof any suchpasses so that they will work with the ranger, or any other mechanismusing multiple sub-ranges (as opposed to the more limited value_range).
The official document will live here, but is included below fordiscussion:
     https://gcc.gnu.org/wiki/irange-best-practices

Feel free to respond to this post with any questions or comments.


Thanks for the writeup!

The biggest question on my mind at the moment is probably about
how to go about rewriting classes that maintain their own ranges
(most often as an array of two offset_int) to use one of the Ranger
classes.  Two examples are the access_ref class in builtins.h and
the builtin_memref class in gimple-ssa-warn-restrict.c.

Both of these compute the size of an object (as a simple range)
and the offset into it (also as a range).  In the future, they will
track of sizes of multiple objects (from PHI nodes).

My thinking is to do this in two steps: In 1) replace each
offset_int[2] member array with an instance of int_range<1> and
then rewrite all the code that manipulates the array with calls
to the ranger API. That should be fairly straightforward. In 2)

This sounds like a sound approach. Instead of passing pairs ofintegers, pass an irange and limit yourself to the non-deprecated partof the API.

replace the simple int_range<1> with something more interesting
(int_range_max?) and rewrite the final code that processes

The ranger will use whatever sized range you pass it. So if you wantfine granularity, pass it an int_range_max.

However, if your final destination in memory is constrained (you'restoring a gazillion of these), you can use the irange_pool to allocatethe minimum amount of storage needed. Note, that the irange_pool hasnot been contributed yet. It'll come with the ranger.


Taking the strlen code as an example, you could transform the following:

          vr_values *v = CONST_CAST (vr_values *, rvals);
          const value_range_equiv *vr = v->get_value_range (si->nonzero_chars);
          if (vr->kind () != VR_RANGE || !range_int_cst_p (vr))
            return false;

...into:

int_range_max r;
if (!ranger->range_of_expr (r, si->nonzero_chars, si->stmt))
  return false;

Where ranger is declared somewhere at the top of your pass. Perhaps inprintf_strlen_execute() and pass it around like you do rvals.

Note that for better precision, you should pass the gimple context.That is, the statement from which si->nonzero_chars came from. In theabove case, I think you want si->stmt. Currently get_value_range() hasa STMT parameter which is unused. This was the reason for providingsuch parameter.

Finally... you can store R into an int_range<XX> depending on how muchspace you want to use in your ultimate storage location. The assignmentoperator will squish R into whatever space you give it. So thefollowing will truncate things appropriately:


int_range<3> final_location = r;

However, if you're using irange_pool to allocate just the amount ofspace needed, you could do:


// Presumably declared wherever you declared the ranger.
irange_pool pool;
...
...
irange *p = pool.allocate (r);

p will hold a pointer to a range that contains R with no extra space.This is ideal for longer term storage. For intermediate results, useint_range_max.

Note that I am already converting sprintf/strlen to use the rangerinstead of evrp/vr_values (aldyh/ranger-staging git branch). So thework of converting to range_of_expr above, and passing around gimplecontext, will already be done with the contribution of ranger. What Iwill not provide for now, is the wholesale conversion of other passes tothe "clean" irange API. So if your pass is still using kind() andstashing away pairs of offset_int's, that will have to be doneseparately. I can offer guidance though, and help as time permits.

the results to scan all the subranges for overflow or overlap,
as well as the code that presents them in warnings/notes.  It
would be nice to have support for the formatting of ranges in
the pretty-printer to cut down on the repetitive tests that
determine how to format a constant (%E, vs a range [%E, %E],
vs a multi-range [%E, %E][%E, %E], ...[%E, %E]).

I'm not a big fan of including ranges in warnings/errors. The rangerand the vrp twins, are bound to change over releases with finer andfiner ranges. Having our error messages depend on ranges that may ormay not change can be confusing for users, and it means we must changeregression tests to match a changing compiler at each release.

I would much rather see a generic warning of "P may be out of bounds"than "P may be [X,Y][Z,A][C,D]". That's just confusing :). Also, notethat the ranger can come up with ranges that are completely non-obvious,as they pass through PHIs, back edges, and a combination of operations.This will only add to the user confusion.


I suppose I/we should write this up as a case study after I/we
do the first rewrite.

Longer term, the size of an object (or objects) and an offset
into it/them seems like a generally useful property that would
be best associated with a pointer somehow, so that it could be
queried by an API similar to the one exposed by the irange class.
I imagine it would benefit both warnings and optimizations.

I do have one concern with a wholesale switch over to the ranger
classes and APIs.  Being designed with conservative assumptions
suitable for optimizers, the current APIs aren't necessarily
ideal for warnings,  A basic example is integer overflow and
wrapping.  In the general case, the optimizer must assume that
the range of the argument in 'malloc (n + 1)' might overflow (or
wrap).  A warning, though, should detect when it does and complain.
So I'd like to see a mode where the ranger evaluates expressions
in infinite precision.

The ranger will evaluate an expression in the mode of the treesthere-in, nothing more, nothing less. If your range has an underlyingtype that wraps, the result will also wrap.

If you want to perform an operation ignoring such constraints, youprobably want to cast to a larger type before doing the operation:


range_cast (range_n, SOME_LARGER_INT_TYPE);
range_cast (range_1, SOME_LARGER_INT_TYPE);

int_range_max result;
op.plus.fold_range (result, SOME_LARGER_INT_TYPE, range_n, range_1);

But in the case of malloc(n+1) above, the argument sits in an SSA, andthe range plus operation was done in the underlying type. There's noway of changing that. The ranger won't lie to you. If the operationmay wrap or overflow, it will show up as so in the range. I don't thinkwe have any plans of changing this behavior.


Was that your question?

Aldy

Re: irange best practices document

Reply via email to