Christophe Lyon <christophe.l...@linaro.org> writes:
> On Tue, 15 Oct 2019 at 12:36, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Richard Biener <richard.guent...@gmail.com> writes:
>> > On October 14, 2019 2:32:43 PM GMT+02:00, Richard Sandiford 
>> > <richard.sandif...@arm.com> wrote:
>> >>Richard Biener <richard.guent...@gmail.com> writes:
>> >>> On Fri, Oct 11, 2019 at 4:42 PM Richard Sandiford
>> >>> <richard.sandif...@arm.com> wrote:
>> >>>>
>> >>>> The range-tracking code has a pretty hard-coded assumption that
>> >>>> is_gimple_min_invariant is equivalent to "INTEGER_CST or invariant
>> >>>> ADDR_EXPR".  It seems better to add a predicate specifically for
>> >>>> that rather than continually fight cases in which it can't handle
>> >>>> other invariants.
>> >>>>
>> >>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
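
For reference, the predicate in question was roughly of this shape (a
sketch only, not the patch itself):

  /* Return true if VAL is an INTEGER_CST or an invariant ADDR_EXPR --
     the only invariants the range-tracking code actually handles.  */
  static bool
  constant_range_value_p (const_tree val)
  {
    return (TREE_CODE (val) == INTEGER_CST
            || (TREE_CODE (val) == ADDR_EXPR
                && is_gimple_invariant_address (val)));
  }
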
>> >>>
>> >>> ICK.  Nobody is going to remember this new restriction and
>> >>> constant_range_value_p reads like constant_value_range_p ;)
>> >>>
>> >>> Btw, is_gimple_invariant_address shouldn't have been exported,
>> >>> its only use could have used is_gimple_min_invariant...
>> >>
>> >>What do you think we should do instead?
>> >
>> > Just handle POLY_INT_CST in a few places so that we drop to varying quickly enough.
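
Concretely, that would mean something like the following at the points
that compute range bounds (a sketch, assuming a helper along the lines
of the old set_value_range_to_varying; the exact placement and helper
name depend on the VRP code of the time):

  /* Ranges with POLY_INT_CST bounds can't be represented exactly,
     so punt to VARYING rather than ICEing on them.  */
  if (POLY_INT_CST_P (min) || POLY_INT_CST_P (max))
    {
      set_value_range_to_varying (vr);
      return;
    }
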
>>
>> OK, how about this?  Aldy's suggestion would be fine by me too,
>> but I thought I'd try this first given Aldy's queasiness about
>> allowing POLY_INT_CSTs further in.
>>
>> The main case in which this gives useful ranges is a lower bound
>> of A + B * X becoming A when B >= 0.  E.g.:
>>
>>   (1) [32 + 16X, 100] -> [32, 100]
>>   (2) [32 + 16X, 32 + 16X] -> [32, MAX]
>>
>> But the same thing can be useful for the upper bound with negative
>> X coefficients.
>>
>> We can revisit this later if keeping a singleton range for (2)
>> would be better.
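
For (1), a rough sketch of the idea (not the patch itself): keep the
constant term of the POLY_INT_CST when all of the indeterminate
coefficients are known to be nonnegative.  The helper name below is
made up for illustration:

  /* If MIN is A + B1*X1 + ... with every Bi nonnegative, then A is a
     valid conservative lower bound; otherwise punt (NULL_TREE).  */
  static tree
  constant_lower_bound_of_poly (tree min)
  {
    if (!POLY_INT_CST_P (min))
      return min;
    poly_wide_int value = wi::to_poly_wide (min);
    signop sgn = TYPE_SIGN (TREE_TYPE (min));
    for (unsigned int i = 1; i < NUM_POLY_INT_COEFFS; ++i)
      if (wi::neg_p (value.coeffs[i], sgn))
        return NULL_TREE;
    return wide_int_to_tree (TREE_TYPE (min), value.coeffs[0]);
  }
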
>>
>> Tested as before.
>>
>> Richard
>>
>>
> Hi Richard,
>
> This patch did improve the aarch64 results quite a lot; however, there
> are still a few failures that used to pass circa r276650:
>     gcc.target/aarch64/sve/loop_add_6.c -march=armv8.2-a+sve scan-assembler \\tfsub\\tz[0-9]+\\.d, p[0-7]/m
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tz[0-9]+\\.b, p[0-7]/m, z[0-9]+\\.b, z[0-9]+\\.b\\n 1
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d, z[0-9]+\\.d\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tz[0-9]+\\.h, p[0-7]/m, z[0-9]+\\.h, z[0-9]+\\.h\\n 1
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tadd\\tz[0-9]+\\.s, p[0-7]/m, z[0-9]+\\.s, z[0-9]+\\.s\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tand\\tz[0-9]+\\.b, p[0-7]/m, z[0-9]+\\.b, z[0-9]+\\.b\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tand\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d, z[0-9]+\\.d\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tand\\tz[0-9]+\\.h, p[0-7]/m, z[0-9]+\\.h, z[0-9]+\\.h\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tand\\tz[0-9]+\\.s, p[0-7]/m, z[0-9]+\\.s, z[0-9]+\\.s\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\teor\\tz[0-9]+\\.b, p[0-7]/m, z[0-9]+\\.b, z[0-9]+\\.b\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\teor\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d, z[0-9]+\\.d\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\teor\\tz[0-9]+\\.h, p[0-7]/m, z[0-9]+\\.h, z[0-9]+\\.h\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\teor\\tz[0-9]+\\.s, p[0-7]/m, z[0-9]+\\.s, z[0-9]+\\.s\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tfadd\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d, z[0-9]+\\.d\\n 1
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tfadd\\tz[0-9]+\\.h, p[0-7]/m, z[0-9]+\\.h, z[0-9]+\\.h\\n 1
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\tfadd\\tz[0-9]+\\.s, p[0-7]/m, z[0-9]+\\.s, z[0-9]+\\.s\\n 1
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\torr\\tz[0-9]+\\.b, p[0-7]/m, z[0-9]+\\.b, z[0-9]+\\.b\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\torr\\tz[0-9]+\\.d, p[0-7]/m, z[0-9]+\\.d, z[0-9]+\\.d\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\torr\\tz[0-9]+\\.h, p[0-7]/m, z[0-9]+\\.h, z[0-9]+\\.h\\n 2
>     gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve scan-assembler-times \\torr\\tz[0-9]+\\.s, p[0-7]/m, z[0-9]+\\.s, z[0-9]+\\.s\\n 2
>     gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve scan-assembler-times \\tfsub\\tz[0-9]+\\.d, p[0-7]/m 1
>     gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve scan-assembler-times \\tfsub\\tz[0-9]+\\.s, p[0-7]/m 1
>     gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve scan-assembler-times \\tsub\\tz[0-9]+\\.b, p[0-7]/m 1
>     gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve scan-assembler-times \\tsub\\tz[0-9]+\\.d, p[0-7]/m 2
>     gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve scan-assembler-times \\tsub\\tz[0-9]+\\.h, p[0-7]/m 1
>     gcc.target/aarch64/sve/reduc_5.c -march=armv8.2-a+sve scan-assembler-times \\tsub\\tz[0-9]+\\.s, p[0-7]/m 2
>
> Just to make sure you are aware of these :-)

Yeah, aware of them :-)  These were UNRESOLVED before the patch due
to the VRP ICEs.  I assume they're from the recent reduction changes,
but I haven't had a chance to look at them in detail yet.

Thanks,
Richard
