Aldy & Andrew
INTRODUCTION
------------
irange is a class for storing and manipulating multi-ranges of
integers. It is meant to be a replacement for value_range's, and can
currently inter-operate seamlessly with them.
Original value_range's can contain a range of integers, say from 10 to
20 inclusive, represented as [10, 20]. It also has a way of
representing an anti-range, which is the inverse of a range. For
example, everything except 10 to 20 is represented as an anti-range of
~[10, 20]. This is really, shorthand for the union of
[MIN_INT, 9] U [21, MAX_INT].
The value_range representation has the limitation that higher
granularity is not representable without losing precision. For
example, you cannot specify the range of [10, 15] U [20, 30], instead
it is represented ambiguously with with a range of [10, 30].
On the other hand, multi-ranges with the irange class, can represent
an arbitrary number of sub-ranges. More formally, multi-ranges have
0 or more non-intersecting sub-ranges with integral bounds. For
example, you can specify a range containing the numbers of [10, 15]
and [20, 30] with an irange of [10, 15][20, 30]. With irange, instead
of using anti-ranges for ~[10, 20] the underlying number of ranges can
be represented accurately with [MIN_INT, 9][21, MAX_INT].
Multi-ranges are not limited to 1 or 2 sub-ranges. Instead, you can
specify any number of sub-ranges. For example:
int_range<5> five_pairs;
int_range<2> two_pairs;
int_range<1> legacy_value_range;
widest_irange huge_irange; // currently 255 sub-ranges.
The special case of int_range<1> provides legacy support for
value_range. Currently, value_range is just a typedef for int_range<1>.
The specially named widest_irange is used for "unlimited" sub-ranges,
and is meant to be used when calculating intermediate results.
Currently it is a large number (255), but could be changed without
prior notice. Note that "widest" does not have anything to do with
the range of the underlying integers, but the maximum amount of
sub-range pairs available for calculation.
Here are some examples of calculations with different sub-range
granularity:
// Assume:
tree n10 = build_int_cst (integer_type_node, 10);
tree n20 = build_int_cst (integer_type_node, 20);
tree n30 = build_int_cst (integer_type_node, 30);
tree n40 = build_int_cst (integer_type_node, 40);
tree n50 = build_int_cst (integer_type_node, 50);
tree n60 = build_int_cst (integer_type_node, 60);
int_range<1> (aka value_range):
int_range<1> r1 (n10, n20);
int_range<1> r2 (n30, n40);
r1.union_ (r2);
r1.dump ();
// Outputs: [10, 40]
// Note: Since result doesn't fit in an int_range<1>, it is
// conservatively truncated.
int_range<2>:
int_range<2> r1 (n10, n20);
int_range<2> r2 (n30, n40);
r1.union_ (r2);
r1.dump ();
// Outputs: [10, 20][30, 40]
int_range<2> r3 (n50, n60);
r1.union_ (r3);
r1.dump ();
// Outputs: [10, 20][30, 60]
// Note: Unioning [10, 20][30, 40] and [50, 60] doesn't fit
// into an int_range<2>, so it is truncated.
int_range<1> small;
small = r1; // r1 is [10,20][30,60] as above.
small.dump ();
// Outputs: [10, 60]
// Note: Copying a larger range into a smaller range will
// result in truncating the result.
widest_irange:
widest_irange r;
for (i = 0; i <= 40; i += 10) {
tree low = build_int_cst (integer_type_node, i);
tree high = build_int_cst (integer_type_node, i + 1);
int_range<2> tmp (low, high);
r.union_ (tmp);
}
r.dump ();
// Outputs: [0, 1][10, 11][20, 21][30, 31][40, 41]
WRITE YOUR CODE FOR INFINITE SUB-RANGES
---------------------------------------
Your code should be agnostic to the sub-ranges there-in. For example,
if you wanted to check if the numbers 5 through 10 are in a range R,
avoid looking at min/max/kind and having to special case VR_RANGE,
VR_ANTI_RANGE, and VR_VARYING. Instead do something like this:
widest_irange my_range (build_int_cst (5, integer_type_node),
build_int_cst (10, integer_type_node));
my_range.intersect (R);
if (R == my_range)
return goodness;
If on the other hand, if you want to check if there is ANY of the
numbers from 5 through 10 are in the range, you could do:
my_range.intersect (R);
if (!R.undefined_p ())
return goodness;
First, don't be afraid to build throw-away ranges. They're cheap,
light-weight, and don't require memory allocations.
Also, you could also have declared my_range as containing a max of 3
sub-ranges with:
int_range<3> my_range;
However, the result of the intersect code above will be limited to 3
sub-ranges. Sometimes 3 is all you need, but we suggest that
intermediate results be done in widest_irange (currently 255 sub-ranges)
and then the result be copied to whatever range you wish to return.
For example:
void some_calculation(irange &result, const irange &a, const irange &b)
{
widest_irange intermediate = a;
do_stuff (intermediate, b);
result.intersect (intermediate);
}
Note that operations on ranges are done in the granularity of the
range itself, not its argument. For example, BIG.intersect(SMALL) is
done in the granularity of BIG. On the other hand, if SMALL is an
int_range<1> and BIG is an int_range<5>, SMALL.intersect(BIG) will be
squished down to the granularity of SMALL.
Example of iterating through sub-ranges in a range:
if (r.undefined_p ())
return;
for (unsigned i = 0; i < r.num_pairs (); ++i) {
wide_int x = r.lower_bound (i);
wide_int y = r.upper_bound (i);
// do stuff with x/y
}
// Note: undefined_p() is syntactic sugar for num_pairs() == 0.
FUNCTION SHOULD TAKE IN IRANGE'S AND STICK TO THE API
-----------------------------------------------------
All functions taking in value_range's should be rewritten to take in
an irange instead. Taking in the irange base class guarantees your
code works with any range.
You should keep to the main API which is listed in value-range.h,
while avoiding the methods marked with DEPRECATED, especially kind (),
min (), max (), symbolic_p (), which are all slated to be removed.
For example, symbolic_p should be avoided, because 9 out of 10 times
passes don't need symbolics. Exceptions to these are the VRP twins
(evrp / VRP) which need it internally. But virtually all passes that
currently use the evrp_range_analyzer, never do anything with a
symbolic range returned by get_value_range. Also, get_range_info()
returns global range information, and does not use symbolics.
The API normalizes symbolics into integers, so you can use it and
assume the range contains only numbers. For example, a symbolic of
[5,X] is normalized into [5,MAX] when accessed through the API. That
is, the lower bound of the range is 5, and the upper bound is
TYPE_MAX_VALUE for the type. Similarly, a symbolic of [A, B] is
normalized to [MIN,MAX].
If your functions take in an irange and limit themselves to the
non-deprecated API, your code will be guaranteed to work with either
one or an infinite amount of sub-ranges when available.
Note that if your pass needs symbolics and you're not evrp/vrp
(really, you don't), your ability to make use of multi-ranges is
limited to operations involving 2 constant ranges, and most operations
will be truncated down to a value_range, thus defeating the purpose.
If your pass is like 90% of passes, just stick to the API, use
iranges, and you'll be fine.
Keep in mind that passing a multi-range into the EVRP eco-system won't
buy you anything, as it is written entirely with value_range's. You
need to pass a multi-range to the ranger, or range-ops to get anything
with multi-ranges back.
ANTI-RANGES
-----------
Anti ranges don't really exist. Although they currently exist in the
internal representation of the legacy code, you shouldn't be checking
for them, and you shouldn't be treating them specially. You should be
asking for the bounds of a range or sub-range, not for anti-range in
particular.
For example, ~[5,10] is really [0..4][11,MAX], and should be treated
as such.
Cumbersome operations that were previous done with anti-ranges are
much easier now. For example, previously if you were to write a
predicate to check if X is in range R you would have to:
check that R is not undefined_p()
check that R is not varying_p()
check that R is either a VR_RANGE or a VR_ANTI_RANGE
check that R is a constant_p (that is, is not a symbolic)
check that R.kind()==VR_RANGE && X >= R.min() && X <= R.max()
or....that R.kind()==VR_ANTI_RANGE && X < R.min() && X > R.max()
With the new API you check membership with:
R.contains_p(X)
VARYING AND UNDEFINED RANGES
----------------------------
The old VARYING nomenclature is now just syntactic sugar for a range of
[MIN,MAX]. You can safely assume that any range (with the exception of
an undefined range) has bounds. So the lower_bound() of a range that
has a varying is the minimum value for the type. You shouldn't have to
special case handling varying ranges. They're just ranges that span the
entire domain of the underlying type.
Undefined ranges are syntactic sugar for the empty set (that is
R.num_pairs() == 0). Since you obviously can't ask for the bounds of
an undefined_p() range, you may have to check that a range is not the
empty set first before asking for its bounds.
RETURNING RANGE RESULTS FROM A FUNCTION
---------------------------------------
Note, that functions that return ranges should do so by passing a
reference (or pointer, though we prefer references). The reason is that
you want the caller to decide what sub-range granularity they desire.
Otherwise you end up returning by value an int_range<5>, when all the
user wanted was an int_range<2>. Or worse, you truncate the results
when the user wants a higher precision.
There are various setters that are quicker than full-on copying. You
should use them when possible. For example,
instead of:
result = int_range<1> (build_zero_cst (integer_type_node),
build_zero_cst (integer_type_node));
do:
result.set_zero (integer_type_node);
instead of:
result = int_range<1> (a, b);
do:
result.set (a, b);
VALUE_RANGE'S ARE NOT EVIL, BUT ARE SEVERELY LIMITED
----------------------------------------------------
Note that int_range<1> and value_range are the same thing. In fact,
value_range is typedef'd to be an int_range<1>. However, we prefer
int_range<1>, as value_range is deprecated because it implies legacy
support. Do not that your functions should take in irange's, not
int_range<> or value_range, as this this ensures that it will work
with any passed range.
PERSISTENT IRANGES
------------------
If you wish to persistently keep iranges beyond the scope of your
function, we will provide an irange_pool class when the ranger is
contributed. It will provide a mechanism for dynamically allocating
ranges. More details will follow, but the general gist is:
irange_pool pool;
// Allocate an irange of 5 sub-ranges.
irange *p = pool.allocate (5);
// Allocate an irange of 3 sub-ranges.
irange *q = pool.allocate (3);
// Allocate an irange with as many sub-ranges as are currently
// used in "some_other_range".
irange *r = pool.allocate (some_other_range);
In the above snippets, the "pool" object will deallocate any
ranges at destruction.