> On Jun 4, 2018, at 10:09 AM, Jeff Law <l...@redhat.com> wrote:
>
> On 06/04/2018 08:06 AM, Paul Koning wrote:
>>
>>
>>> On Jun 4, 2018, at 9:51 AM, Jeff Law <l...@redhat.com> wrote:
>>>
>>> On 06/04/2018 07:31 AM, Paul Koning wrote:
>>>> The internals manual in its description of the "matching constraint" says
>>>> that it works for cases where the in and out operands are somewhat
>>>> different, such as *p++ vs. *p. Obviously that is meant to cover post_inc
>>>> side effects.
>>>>
>>>> The curious thing is that auto-inc-dec.c specifically avoids doing this:
>>>> if it finds what looks like a suitable candidate for auto-inc or auto-dec
>>>> optimization but that operand occurs more than once in the insn, it
>>>> doesn't make the change. The result is code that's both larger and slower
>>>> for machines that have post_inc etc. addressing modes. The gccint
>>>> documentation suggests that it was the intent to optimize this case, so I
>>>> wonder why it is avoided.
>>> I wouldn't be terribly surprised if the old flow.c based auto-inc
>>> discovery handled this, but the newer auto-inc-dec.c doesn't. The docs
>>> were probably written prior to the conversion.
>>
>> That fits, because there is a reference to "the flow pass of the compiler"
>> when these constructs are introduced in section 14.16.
>>
>> So is this an omission, or is there a reason why that optimization was
>> removed?
> I would guess omission, probably on the assumption it wasn't terribly
> important and there wasn't really a good way to test it. There aren't
> many targets that use auto-inc getting a lot of attention these days,
> and those that do can't have multiple memory operands.
By "multiple memory operands" do you mean both source and dest in memory? OK,
but I didn't mean that specifically. The issue is with an instruction that has
a read/modify/write destination operand, such as a two-operand add. If the
destination looks like a candidate for post-inc, it is skipped because it
shows up twice in the RTL, since RTL uses three-operand notation.

For example:

  for (int i = 0; i < n; i++)
    *p++ += i;

produces (on pdp11):

  add $02, r0
  add r1, -02(r0)

rather than simply "add r1, (r0)+". But if I change the += to =, the expected
optimization does take place.

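For anyone who wants to reproduce this, here is a minimal sketch of the two
variants as a standalone file. The function names add_index and store_index
are made up for illustration and do not come from any existing test case.

```c
/* Read/modify/write through a post-incremented pointer.
   This is the form where the destination address appears twice in the
   insn, so the post_inc candidate is reportedly skipped. */
void add_index(int *p, int n)
{
    for (int i = 0; i < n; i++)
        *p++ += i;
}

/* Plain store through a post-incremented pointer.
   This is the form where the post_inc optimization is reported to happen. */
void store_index(int *p, int n)
{
    for (int i = 0; i < n; i++)
        *p++ = i;
}
```

Compiling this with optimization enabled and -S, on a target that has
post_inc addressing, should let the two loop bodies be compared side by side.
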
paul