Re: RFC: Representation of runtime offsets and sizes

Richard Sandiford Thu, 07 Sep 2017 04:08:19 -0700

Thanks for the quick feedback.

Richard Biener <richard.guent...@gmail.com> writes:
> On Wed, Sep 6, 2017 at 10:18 PM, Richard Sandiford
> <richard.sandif...@linaro.org> wrote:
>> The next main step in the SVE submission is to add support for
>> offsets and sizes that are a runtime invariant rather than a compile
>> time constant.  This is an RFC about our approach for doing that.
>> It's an update of https://gcc.gnu.org/ml/gcc/2016-11/msg00031.html
>> (which covered more topics than this message).
>>
>> The size of an SVE register in bits can be any multiple of 128 between
>> 128 and 2048 inclusive.  The way we chose to represent this was to
>> have a runtime indeterminate that counts the number of 128 bit blocks
>> above the minimum of 128.  If we call the indeterminate X then:
>>
>> * an SVE register has 128 + 128 * X bits (16 + 16 * X bytes)
>> * the last int in an SVE vector is at byte offset 12 + 16 * X
>> * etc.
>>
>> Although the maximum value of X is 15, we don't want to take advantage
>> of that, since there's nothing particularly magical about the value.
>>
>> So we have two types of target: those for which there are no runtime
>> indeterminates, and those for which there is one runtime indeterminate.
>> We decided to generalise the interface slightly by allowing any number
>> of indeterminates, although the underlying implementation is still
>> limited to 0 and 1 for now.
>>
>> The main class for working with these runtime offsets and sizes is
>> "poly_int".  It represents a value of the form:
>>
>>   C0 + C1 * X1 + ... + Cn * Xn
>>
>> where each coefficient Ci is a compile-time constant and where each
>> indeterminate Xi is a nonnegative runtime value.  The class takes two
>> template parameters, one giving the number of coefficients and one
>> giving the type of the coefficients.  There are then typedefs for the
>> common cases, with the number of coefficients being controlled by
>> the target.
>
> So a poly_int is a (nested) CHREC with (integer) constant CHREC_LEFT
> and CHREC_RIGHT (if CHREC_LEFT isn't such a CHREC itself):
>
> CHREC <CHREC <CHREC <C0, C1>, C2>, C3>
>
> ?


I guess you could view something that iterates over every possible
SVE implementation as a chrec, but I don't think it really applies
to places that use poly_int.  Each poly_int has a fixed value rather
than an evolving value; we just don't know what it is at compile time.

> For SVE you only need C0 + C1 * X1 but not the full general
> series, right?  What do you think would need the full general series?
> Just wonder if making it this general is really required.

Having the number of coefficients be a template parameter was mostly
a convenient way of handling 1 and 2 coefficients with the same class.
The current implementation doesn't handle 3 or more coefficients,
but I don't think we pay any penalty for allowing that possibility
in future.
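
A minimal sketch of how one class template can cover both the 1-coefficient
and 2-coefficient cases.  The names toy_poly, expand_at, fixed_size and
sve_size are made up for illustration; this is a toy stand-in, not the
poly_int definition from the patches.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a poly_int-like class; NOT GCC's real definition.
// N is the number of coefficients, C is the coefficient type.
template <unsigned N, typename C>
struct toy_poly
{
  C coeffs[N];
};

// Coefficient-wise addition is the same code for any N.
template <unsigned N, typename C>
toy_poly<N, C>
operator+ (const toy_poly<N, C> &a, const toy_poly<N, C> &b)
{
  toy_poly<N, C> r;
  for (unsigned i = 0; i < N; ++i)
    r.coeffs[i] = a.coeffs[i] + b.coeffs[i];
  return r;
}

// Expand C0 (+ C1 * X) for one concrete runtime value X >= 0.
// As in the message, 3 or more coefficients are not handled.
template <unsigned N, typename C>
C
expand_at (const toy_poly<N, C> &p, C x)
{
  static_assert (N >= 1 && N <= 2, "only 1 or 2 coefficients supported");
  assert (x >= 0);
  C value = p.coeffs[0];
  for (unsigned i = 1; i < N; ++i)
    value += p.coeffs[i] * x;
  return value;
}

// Hypothetical typedefs: a target with no indeterminate would pick N = 1,
// a target with one indeterminate (such as SVE) would pick N = 2.
typedef toy_poly<1, int64_t> fixed_size;
typedef toy_poly<2, int64_t> sve_size;
```

With N = 1 the loop in expand_at never runs, so the value is just the
constant C0; the same source serves both kinds of target.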

It's really hard to know whether 2 indeterminates will ever be needed.
A few years ago, people might have thought that we wouldn't even need
1 indeterminate, so I think it's dangerous to assume that 1 is always
going to be enough.

> Is there any way to get constraints on X1 here?  The Xs are
> implicit in poly-int and they have (implicitly) the same type
> as the Cs?

In terms of the type of X: all poly_ints use the same global X,
which has no particular type.  In effect it's "infinite precision".

The type of the result of the expansion really depends on context,
in the same way as it does for scalars.  What matters about the
coefficient type is whether it is known to be wide enough for every
value it needs to hold, just as when current code uses things like
HOST_WIDE_INT.

So for example, SUBREG_BYTE is represented for size reasons as a
poly_uint16, since no valid subreg can have coefficients outside
uint16_t (the same range as GET_MODE_SIZE).  But there's no implicit
truncation to uint16_t in the multiplication and addition, so we can add
it directly to poly_int64s and poly_uint64s.  Adding two poly_uint16s
gives a poly_int64.
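
A small sketch of the non-truncating behaviour described above.  The types
toy_poly2, toy_poly_uint16 and toy_poly_int64 are hypothetical stand-ins
for the real poly_uint16 and poly_int64, and the operators only cover the
two combinations mentioned in this message.

```cpp
#include <cassert>
#include <cstdint>

// Toy two-coefficient value C0 + C1 * X; a stand-in, not GCC's poly_int.
template <typename C>
struct toy_poly2
{
  C coeffs[2];
};

typedef toy_poly2<uint16_t> toy_poly_uint16;
typedef toy_poly2<int64_t> toy_poly_int64;

// Adding two narrow values: each coefficient is widened to 64 bits
// before the addition, so the sum cannot wrap at 16 bits.
inline toy_poly_int64
operator+ (const toy_poly_uint16 &a, const toy_poly_uint16 &b)
{
  toy_poly_int64 r;
  for (int i = 0; i < 2; ++i)
    r.coeffs[i] = int64_t (a.coeffs[i]) + int64_t (b.coeffs[i]);
  return r;
}

// Adding a narrow value directly to a wide one, again without truncation.
inline toy_poly_int64
operator+ (const toy_poly_int64 &a, const toy_poly_uint16 &b)
{
  toy_poly_int64 r;
  for (int i = 0; i < 2; ++i)
    r.coeffs[i] = a.coeffs[i] + int64_t (b.coeffs[i]);
  return r;
}
```

For example, adding the toy values 65535 + 16 * X and 1 + 16 * X yields
65536 + 32 * X, where a 16-bit result would have wrapped the constant
term to 0.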

The only type for which the expansion is implicitly truncating is
poly_wide_int, since all wide_int arithmetic is truncating.  That doesn't
get used much though.

> In a way a more "useful" representation would be
>
> C0 + C1 * [X1min, X1max] + ... + Cn * [Xnmin, Xnmax]
>
> if we're talking about inventing sth that's not only useful for SVE.
> It's basically sth like a value-range representation for
> arbitrary(?) sequences.

I think it will always make sense to normalise the numbers so that
the Xmins are 0, since that makes the implementation much easier.
(We originally had X count 128-bit blocks, with 1 being the minimum
value, but it just made things unnecessarily complicated.)
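
One way to see why a zero minimum helps, as a sketch only: the message
doesn't say which simplifications were meant, so this is an assumed
example.  With X >= 0 and no upper bound, "a <= b for every X" reduces
to a coefficient-wise test.  The names toy_offset, always_le and
from_one_based are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Toy value C0 + C1 * X with X >= 0 and no encoded maximum.
// A stand-in for illustration, not GCC's poly_int.
struct toy_offset
{
  int64_t c0;
  int64_t c1;
};

// True if a <= b for every X >= 0.  X = 0 forces a.c0 <= b.c0, and
// arbitrarily large X forces a.c1 <= b.c1; together they are sufficient.
inline bool
always_le (const toy_offset &a, const toy_offset &b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Convert from a convention in which Y >= 1 counts 128-bit blocks:
// D0 + D1 * Y == (D0 + D1) + D1 * X, where X = Y - 1.
inline toy_offset
from_one_based (int64_t d0, int64_t d1)
{
  toy_offset r = { d0 + d1, d1 };
  return r;
}
```

With a minimum of 1 instead, the first test would have to compare
a.c0 + a.c1 against b.c0 + b.c1, which is the kind of extra bookkeeping
that normalising to 0 avoids.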

In terms of the maximum: we deliberately don't want to encode a maximum
for SVE, since there's nothing particularly special about the current
limit of 2048 bits.  But I think it would be easy to add a limiting
mechanism if another target needed one.  The function interface would
be the same as it is now.

The implementation details are hidden in the header file and in the
rtl<->poly_int and tree<->poly_int conversion routines.  Almost
everything else treats the poly_int as abstract and just uses the
interfaces described in the documentation.  Those interfaces would
be the same with limited coefficients (or with the nested chrec
representation), so it should be possible to substitute different or
more general implementations in future without having to change the
use sites.

Thanks,
Richard
