On 02/22/2016 07:32 AM, Richard Biener wrote:
Presumably DOM is not looking at r = s.r and realizing it could look up s.r
piece-wise in the available expression table. If it did, it would
effectively turn that fragment into:
s = { {1, 2}, 3 };
s.r.x = 1;
s.r.y = 2;
struct R r = {1, 2};
s.z = 3;
At which point we no longer have the may-read of s.r.{x,y} and DSE would see
the initial assignment as dead.
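
To make that fragment concrete, here is a small self-contained sketch.
The type definitions and the function names are assumptions for
illustration (the thread only shows the statements). frag_before is the
code as written, frag_after is a hand-applied version of what a
piece-wise lookup of s.r might yield once DSE removes the initial store:

```c
/* Hypothetical types; only the statements appear in the thread. */
struct R { int x, y; };
struct S { struct R r; int z; };

/* As written: the copy r = s.r reads s.r as an aggregate, so the
   initial store to s may be read and cannot be removed. */
static int frag_before(void)
{
    struct S s = { {1, 2}, 3 };
    s.r.x = 1;
    s.r.y = 2;
    struct R r = s.r;
    s.z = 3;
    return r.x + r.y + s.z;
}

/* Hand-transformed: r is built from the known values of s.r.x and
   s.r.y, so nothing reads s before every field is overwritten and
   the initial aggregate store can go away. */
static int frag_after(void)
{
    struct S s;              /* initial store removed as dead */
    s.r.x = 1;
    s.r.y = 2;
    struct R r = {1, 2};
    s.z = 3;
    return r.x + r.y + s.z;
}
```

Both functions compute the same value; the second just has no
may-read of s.r.{x,y} left.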
Yeah, but if it does not become dead you just increased code size or lifetime...
Increasing lifetimes is inherent in just about any CSE optimization.
But as I mentioned, I'm not sure trying to add this aggregate handling
to DOM is wise.
FRE does something related - it looks at all piecewise uses of 'r' and
eventually replaces them with pieces of s.r when seeing the r = s.r
aggregate assignment. Of course that only makes the store to r dead if there
are no uses of it left.
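
As I read the description above, the effect is roughly the following
source-level rewrite. This is only a sketch: the types and function
names are made up for illustration and it says nothing about how FRE
is actually implemented.

```c
/* Hypothetical types for illustration. */
struct R { int x, y; };
struct S { struct R r; int z; };

/* Before: an aggregate copy followed by piecewise uses of r. */
static int sum_before(const struct S *s)
{
    struct R r = s->r;
    return r.x + r.y;
}

/* After: each use of a piece of r is replaced by the matching piece
   of s->r. With no uses of r left, the store to r is dead. */
static int sum_after(const struct S *s)
{
    return s->r.x + s->r.y;
}
```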
*If* we were to try and do something similar in DOM, we'd probably want
to try and share much of the infrastructure. I'll keep the FRE code in
mind.
I also looked a bit at cases where we find that while an entire store (such
as an aggregate initialization or mem*) may not be dead, pieces of the store
may be dead. That's trivial to detect. It triggers relatively often.
The trick is that, once detected, we have to go back and rewrite the
original statement to only store the live parts. I've only written the
detection code; the rewriting might be somewhat painful.
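
For illustration, the detection side can be modelled with one bit per
byte of the store. This is a toy sketch, not the code in the patch: the
struct layout, the 64-byte limit, and every function name are
assumptions; only the name live_bytes is borrowed from this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of one store: bit i of live_bytes is set while byte i of
   the store may still be read. Limited to 64 bytes for simplicity. */
typedef struct {
    size_t   size;
    uint64_t live_bytes;
} store_info;

/* Mask with the low n bits set, n clamped to 64. */
static uint64_t low_mask(size_t n)
{
    return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* A fresh store starts with all of its bytes live. */
static void store_init(store_info *s, size_t size)
{
    s->size = size > 64 ? 64 : size;
    s->live_bytes = low_mask(s->size);
}

/* A later store overwrites [off, off + len) before any read, so
   those bytes of the original store are dead. */
static void kill_bytes(store_info *s, size_t off, size_t len)
{
    for (size_t i = off; i < off + len && i < s->size; i++)
        s->live_bytes &= ~(UINT64_C(1) << i);
}

enum deadness { FULLY_LIVE, PARTIALLY_DEAD, FULLY_DEAD };

static enum deadness classify(const store_info *s)
{
    if (s->live_bytes == 0)
        return FULLY_DEAD;
    if (s->live_bytes == low_mask(s->size))
        return FULLY_LIVE;
    return PARTIALLY_DEAD;
}
```

A PARTIALLY_DEAD result is the case where the original statement would
have to be rewritten to store only the bytes still set in live_bytes.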
Yes. I think SRA has all the code to do that though, see how it
does scalarization of constant pool loads like
Ohhh. Good idea, I'll dig around SRA for a bit and see if there's
something that can be re-used.
I'm starting to wonder if what we have is a 3-part series.
[1/3] The basic tracking to handle 33562, possibly included in gcc-6
[2/3] Ignore reads that reference stuff not in live_bytes for gcc-7
[3/3] Detect partially dead aggregate stores, rewriting the partially
dead store to only store the live bytes. Also for gcc-7.
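
As a source-level picture of what [3/3] is after, consider a mem* call
whose head is overwritten before it is read. The buffer size and the
function names below are invented for the example, and the second
function is trimmed by hand rather than produced by any pass:

```c
#include <string.h>

/* As written: the first memset clears 16 bytes, but bytes 0..7 are
   overwritten before anything reads them. */
static void init_before(unsigned char buf[16])
{
    memset(buf, 0, 16);
    memset(buf, 0xff, 8);
}

/* Hand-trimmed: the first store is narrowed to its live tail. */
static void init_after(unsigned char buf[16])
{
    memset(buf + 8, 0, 8);
    memset(buf, 0xff, 8);
}
```

Both versions leave the buffer in the same state; the trimmed one
simply stores 8 fewer bytes.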
Obviously [1/3] would need compile-time benchmarking, but I really don't
expect any issues there.
So what's the overall statistic result on [1/3] if you exclude the clobbers?
Very few, call it a dozen, all in libstdc++. They weren't significantly
different than ssa-dse-9.c, so I didn't try to build nice reduced
testcases for them given we've got existing coverage.
One could argue that, with so few real-world cases, 33562 could be
punted to P4 and the patch series deferred to gcc-7. I wouldn't lose
sleep over that option.
Jeff