Am 27.07.2016 um 23:31 schrieb Jeff Law:
> On 07/26/2016 11:32 AM, Bernd Edlinger wrote:
>> Hi!
>>
>> As described in PR 71779 and PR 70903 we have a wrong code issue on
>> aarch64
>> which is triggered by a paradoxical subreg that can be created in
>> store_bit_field_using_insv when the target has an EP_insv bitfield
>> instruction.
>>
>> The attached patch changes this insn...
>>
>> (insn 1047 1046 1048 (set (reg:DI 481)
>>         (subreg:DI (reg/f:SI 479) 0)) y.c:12702 -1
>>      (nil))
>>
>> ... into a more simple insn:
>>
>> (insn 1047 1046 1048 (set (reg:DI 479)
>>         (zero_extend:DI (reg:SI 480))) isl_input.c:2496 -1
>>      (nil))
>>
>> which manages to fix two bugs at the same time.
>>
>> The patch does not include a test case, as I was looking at the PR 71779
>> just by curiosity, and I have only a theoretical interest in aarch64.
>> However,
>> it should be easy to extract one from PR 70903 at least, but I can't
>> do that by
>> myself.
>>
>> Therefore I would like to leave the test case(s) to Cristina Tamar
>> from ARM,
>> and/or Andreas Schwab, who have also helped out with bootstrapping and
>> reg-testing on aarch64 and aarch64-ilp32.
>>
>> Bootstrap and reg-test on x86_64-linux-gnu with no regressions.
>> Successfully bootstrapped on aarch64_ilp32 and the ISL testsuite passes.
>> Also successfully bootstrapped on an aarch64 juno board and no
>> regressions.
>>
>>
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
>>
>> patch-pr71779.diff
>>
>>
>> 2016-07-25  Bernd Edlinger  <bernd.edlin...@hotmail.de>
>>
>>     PR rtl-optimization/71779
>>     PR rtl-optimization/70903
>>     * expmed.c (store_bit_field_using_insv): Don't generate a paradoxical
>>     subreg.
> So you're stumbling into another really interesting area.
>
> I can hazard a guess that the reason we create the paradoxical SUBREG
> rather than a zero or sign extension is because various optimizers know
> that the upper bits in a paradoxical are don't care bits and may make
> transformations with that knowledge.
>
> Once you turn that into a zero/sign extend, then the rest of the
> optimizers must preserve the zero/sign extension property -- they have
> no way of knowing the bits were don't cares.
>
> So it's not necessarily clear that your change results in generally
> better code or not.
>

Yes, I understand, on x86 the stage 2/3 did not change but x86 has
probably not a very sophisticated insv code pattern.

> More importantly, it really feels like you're papering over a real bug
> elsewhere.  AFAICT the use of a paradoxical subreg here is safe.  So the
> real bug ought to be downstream of this point.  Several folks have
> pointed at reload, which would certainly be a point ripe for a
> paradoxical subreg problem.
>

I debugged lra already, because I was initially sure it did something
wrong with the subreg, but it turned out that it follows exactly what
you say here, it spills the subreg twice, once as a 32 bit value,
into a 64 bit slot with stack image in the upper word and then again
as a zero extended 64 bit value.  Later the garbage extended 64 bit
value is used when it should not.

So lra seems not to be the reason, but it is apparently spilling the
paradoxical subreg twice.

So at least lra does not generate better code from paradoxical subreg.

> Sooo. I don't think we can go with this patch as-is today.  We need to
> find the root problem which I believe is later in the pipeline.
>
> This patch might make sense from an optimization standpoint -- I'm
> generally of the opinion that while paradoxical subregs give the
> optimziers more freedom, the optimizers rarely are able to use it and
> they generally know much more about the semantics of and how to optimize
> around zero/sign extensions.  But we really should fix the bug first,
> then come back to improving the code generation.
>

Yes.  And paradoxical subreg have non-intuitive semantics.


Thanks
Bernd.

> Now if someone wanted to look for an interesting optimization project --
> ree.c knows how to do redundant extension eliminations.  A paradoxical
> is a form of extension that isn't currently understood by REE. Similarly
> REE doesn't know how to handle zero-extension as an "and" or sign
> extension as a pair of shifts.  Both of which can occur in practice due
> to costs or ISA limitations.  I wouldn't be surprise if adding those
> capabilities to REE would significantly improve its ability to eliminate
> unnecessary extensions.
>
> Jeff
>

Reply via email to