Re: [103/nnn] poly_int: TYPE_VECTOR_SUBPARTS

Richard Biener Tue, 24 Oct 2017 04:25:59 -0700

On Tue, Oct 24, 2017 at 1:18 PM, Richard Sandiford
<richard.sandif...@linaro.org> wrote:
> Richard Biener <richard.guent...@gmail.com> writes:
>> On Tue, Oct 24, 2017 at 11:40 AM, Richard Sandiford
>> <richard.sandif...@linaro.org> wrote:
>>> Richard Biener <richard.guent...@gmail.com> writes:
>>>> On Mon, Oct 23, 2017 at 7:41 PM, Richard Sandiford
>>>> <richard.sandif...@linaro.org> wrote:
>>>>> This patch changes TYPE_VECTOR_SUBPARTS to a poly_uint64.  The value is
>>>>> encoded in the 10-bit precision field and was previously always stored
>>>>> as a simple log2 value.  The challenge was to use these 10 bits to
>>>>> encode the number of elements in variable-length vectors, so that
>>>>> we didn't need to increase the size of the tree.
>>>>>
>>>>> In practice the number of vector elements should always have the form
>>>>> N + N * X (where X is the runtime value), and as for constant-length
>>>>> vectors, N must be a power of 2 (even though X itself might not be).
>>>>> The patch therefore uses the low bit to select between constant-length
>>>>> and variable-length and uses the upper 9 bits to encode log2(N).
>>>>> Targets without variable-length vectors continue to use the old scheme.
>>>>>
>>>>> A new valid_vector_subparts_p function tests whether a given number
>>>>> of elements can be encoded.  This is false for the vector modes that
>>>>> represent an LD3 or ST3 vector triple (which we want to treat as arrays
>>>>> of vectors rather than single vectors).
>>>>>
>>>>> Most of the patch is mechanical; previous patches handled the changes
>>>>> that weren't entirely straightforward.
>>>>
>>>> One comment, w/o actually reviewing may/must stuff (will comment on that
>>>> elsewhere).
>>>>
>>>> You split 10 bits into 9 and 1, wouldn't it be more efficient to use the
>>>> lower 8 bits for the log2 value of N and either of the two remaining bits
>>>> for the flag?  That way the 8 bits for the shift amount can be eventually
>>>> accessed in a more efficient way.
>>>>
>>>> Guess you'd need to compare code-generation of the TYPE_VECTOR_SUBPARTS
>>>> accessor on aarch64 / x86_64.
>>>
>>> Ah, yeah.  I'll give that a go.
>>>
>>>> Am I correct that NUM_POLY_INT_COEFFS is 1 for targets that do not
>>>> have variable length vector modes?
>>>
>>> Right.  1 is the default and only AArch64 defines it to anything else (2).
>>
>> Going to be interesting (bitrot) times then?  I wonder if it makes sense
>> to initially define it to 2 globally and only change it to 1 later?
>
> Well, the target-independent code doesn't have the implicit conversion
> from poly_int<1, C> to C, so it can't e.g. do:
>
>   poly_int64 x = ...;
>   HOST_WIDE_INT y = x;
>
> even when NUM_POLY_INT_COEFFS==1.  Only target-specific code (identified
> by IN_TARGET_CODE) can do that.
>
> So to target-independent code it doesn't really matter what
> NUM_POLY_INT_COEFFS is.  Even if we bumped it to 2, the extra coefficient
> would always be zero.
>
> FWIW, the poly_int tests in [001/nnn] cover N == 1, 2 and (as far as
> supported) 3 for all targets, so that part isn't sensitive to
> NUM_POLY_INT_COEFFS.
>
>> Do you have any numbers on the effect of poly-int on compile-times?
>> Esp. for example on stage2 build times when stage1 is -O0 -g "optimized"?
>
> I've just tried that for an x86_64 -j24 build and got:
>
> real: +7%
> user: +8.6%
>
> I don't know how noisy the results are though.


What are the corresponding numbers on AARCH64, where NUM_POLY_INT_COEFFS is 2?

> It's compile-time neutral in terms of running a gcc built with
> --enable-checking=release, within a margin of about [-0.1%, 0.1%].

I would have expected that (on x86_64).  Well, hoped (you basically
stated that in 000/nnn).  The question is what the effect is on AARCH64.
As you know we build openSUSE for AARCH64 and build power is limited ;)

Richard.

> Thanks,
> Richard
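
For reference, the 10-bit layout in the quoted patch description (low bit
selects constant-length vs. variable-length, upper 9 bits hold log2(N)) can
be sketched roughly as below.  This is only an illustration and not the code
from the patch: the `subparts` struct, `encode_subparts`, `decode_subparts`
and `valid_subparts_p` here are hypothetical stand-ins, with `subparts`
playing the role of a two-coefficient poly_uint64 whose value is
constant + per_x * X.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_uint64:
// the element count is constant + per_x * X, with X a runtime value.
struct subparts {
  uint64_t constant;  // N
  uint64_t per_x;     // 0 for constant-length vectors, N for variable-length
};

// Width of the field the count has to fit in (assumed from the thread).
const unsigned FIELD_BITS = 10;

// Return true if V is a nonzero power of 2.
static bool is_pow2(uint64_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Return log2 of V, which must be a power of 2.
static unsigned log2_exact(uint64_t v) {
  unsigned n = 0;
  while (v > 1) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Encodable counts are N (constant-length) or N + N * X (variable-length),
// with N a power of 2.  A count such as 12 + 12 * X is rejected.
static bool valid_subparts_p(subparts s) {
  return is_pow2(s.constant) && (s.per_x == 0 || s.per_x == s.constant);
}

// Pack the count: bit 0 is the variable-length flag, bits 1..9 are log2(N).
static unsigned encode_subparts(subparts s) {
  assert(valid_subparts_p(s));
  unsigned variable = s.per_x != 0;
  unsigned field = (log2_exact(s.constant) << 1) | variable;
  assert(field < (1u << FIELD_BITS));
  return field;
}

// Inverse of encode_subparts.
static subparts decode_subparts(unsigned field) {
  uint64_t n = uint64_t(1) << (field >> 1);
  return { n, (field & 1) ? n : 0 };
}
```

With this layout a constant 4-element vector encodes as 4 and a 4 + 4 * X
vector as 5.  The alternative raised earlier in the thread would instead keep
log2(N) in the low 8 bits and move the flag to one of the two remaining bits,
so that the shift amount can be read without a shift of its own.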
