On Mon, Feb 22, 2016 at 5:32 PM, Jeff Law <l...@redhat.com> wrote:
> On 02/22/2016 07:32 AM, Richard Biener wrote:
>
>>> Presumably DOM is not looking at r = s.r and realizing it could look up s.r
>>> piece-wise in the available expression table.  If it did, it would
>>> effectively turn that fragment into:
>>>
>>>      s = { {1, 2}, 3 };
>>>      s.r.x = 1;
>>>      s.r.y = 2;
>>>      struct R r = {1, 2};
>>>      s.z = 3;
>>>
>>> At which point we no longer have the may-read of s.r.{x,y} and DSE would
>>> see
>>> the initial assignment as dead.
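For reference, the fragment above can be written out as a small standalone C sketch. The struct layouts and the function names `original` and `rewritten` are assumptions for illustration only; they are not taken from the PR or from the testsuite. The first function keeps the aggregate copy r = s.r, the second uses the piece-wise replacement, after which nothing reads s.r.{x,y} from the initial store.

```c
/* Hypothetical types; the thread only shows the initializer shapes. */
struct R { int x, y; };
struct S { struct R r; int z; };

/* Before: the aggregate copy r = s.r is a may-read of s.r.{x,y},
   so the initial store to s cannot be treated as dead. */
int original(void)
{
    struct S s = { {1, 2}, 3 };
    s.r.x = 1;
    s.r.y = 2;
    struct R r = s.r;
    s.z = 3;
    return r.x + r.y + s.z;
}

/* After: the copy is replaced by the known values, so every byte of
   the initial store is overwritten before any read of s. */
int rewritten(void)
{
    struct S s = { {1, 2}, 3 };   /* now a candidate for DSE */
    s.r.x = 1;
    s.r.y = 2;
    struct R r = {1, 2};
    s.z = 3;
    return r.x + r.y + s.z;
}
```

Both functions compute the same value; only the second leaves the initial assignment without a reader.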
>>
>>
>> Yeah, but if it does not become dead you just increased code size or
>> lifetime...
>
> Increasing lifetimes is inherent in just about any CSE optimization. But as
> I mentioned, I'm not sure trying to add this aggregate handling to DOM is
> wise.
>
>>
>> FRE does something related - it looks at all piecewise uses of 'r' and
>> eventually replaces them with pieces of s.r when seeing the r = s.r
>> aggregate assignment.  Of course that only makes the store to r dead if
>> there
>> are no uses of it left.
>
> *If* we were to try and do something similar in DOM, we'd probably want to
> try and share much of the infrastructure.  I'll keep the FRE code in mind.
>
>>
>>> I also looked a bit at cases where we find that while an entire store
>>> (such
>>> as an aggregate initialization or mem*) may not be dead, pieces of the
>>> store
>>> may be dead.   That's trivial to detect.   It triggers relatively often.
>>> The trick is once detected, we have to go back and rewrite the original
>>> statement to only store the live parts.  I've only written the detection
>>> code, the rewriting might be somewhat painful.
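A minimal sketch of the detection side, assuming a per-store map with one flag per byte. The names `live_bytes`, `live_init`, `live_kill` and `live_count`, and the fixed 12-byte store size, are hypothetical; this is not the representation used in the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define STORE_SIZE 12   /* assumed size of the tracked store, in bytes */

/* One flag per byte written by the store under consideration. */
struct live_bytes { bool live[STORE_SIZE]; };

/* Initially every byte of the store is considered live. */
void live_init(struct live_bytes *lb)
{
    for (size_t i = 0; i < STORE_SIZE; i++)
        lb->live[i] = true;
}

/* A later store to [off, off + len) overwrites those bytes, so the
   corresponding bytes of the earlier store become dead. */
void live_kill(struct live_bytes *lb, size_t off, size_t len)
{
    for (size_t i = off; i < STORE_SIZE && i < off + len; i++)
        lb->live[i] = false;
}

/* Number of bytes still live: 0 means the whole store is dead, a value
   between 1 and STORE_SIZE - 1 means it is only partially dead. */
size_t live_count(const struct live_bytes *lb)
{
    size_t n = 0;
    for (size_t i = 0; i < STORE_SIZE; i++)
        n += lb->live[i] ? 1 : 0;
    return n;
}
```

With a 12-byte aggregate store followed by stores to bytes 0..3 and 8..11, four bytes remain live, which is the partially dead case: only bytes 4..7 would need to be kept when the original statement is rewritten.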
>>
>>
>> Yes.  I think SRA has all the code to do that though, see how it
>> does scalarization of constant pool loads like
>
> Ohhh.  Good idea, I'll dig around SRA for a bit and see if there's something
> that can be re-used.
>
>>> I'm starting to wonder if what we have is a 3-part series.
>>>
>>> [1/3] The basic tracking to handle 33562, possibly included in gcc-6
>>> [2/3] Ignore reads that reference stuff not in live_bytes for gcc-7
>>> [3/3] Detect partially dead aggregate stores, rewriting the partially
>>>        dead store to only store the live bytes.  Also for gcc-7.
>>>
>>>
>>> Obviously [1/3] would need compile-time benchmarking, but I really don't
>>> expect any issues there.
>>
>>
>> So what's the overall statistic result on [1/3] if you exclude the
>> clobbers?
>
> Very few, call it a dozen, all in libstdc++.  They weren't significantly
> different than ssa-dse-9.c, so I didn't try to build nice reduced testcases
> for them given we've got existing coverage.
>
> One could argue that with so few real-world cases, 33562 could be
> punted to P4 and the patch series deferred to gcc-7.  I wouldn't lose sleep
> over that option.

Way to go at this point IMHO.  That is, keep at P2 please, P4 is for sth else.
It's not a P1 blocker anyway.

Richard.

> Jeff
>
