On Mon, Feb 22, 2016 at 5:32 PM, Jeff Law <l...@redhat.com> wrote:
> On 02/22/2016 07:32 AM, Richard Biener wrote:
>
>>> Presumably DOM is not looking at r = s.r and realizing it could look up
>>> s.r piece-wise in the available expression table.  If it did, it would
>>> effectively turn that fragment into:
>>>
>>>   s = { {1, 2}, 3 };
>>>   s.r.x = 1;
>>>   s.r.y = 2;
>>>   struct R r = {1, 2}
>>>   s.z = 3;
>>>
>>> At which point we no longer have the may-read of s.r.{x,y} and DSE would
>>> see the initial assignment as dead.
>>
>>
>> Yeah, but if it does not become dead you just increased code size or
>> lifetime...
>
> Increasing lifetimes is inherent in just about any CSE optimization.  But as
> I mentioned, I'm not sure trying to add this aggregate handling to DOM is
> wise.
>
>>
>> FRE does something related - it looks at all piecewise uses of 'r' and
>> eventually replaces them with pieces of s.r when seeing the r = s.r
>> aggregate assignment.  Of course that only makes the store to r dead if
>> there are no uses of it left.
>
> *If* we were to try and do something similar in DOM, we'd probably want to
> try and share much of the infrastructure.  I'll keep the FRE code in mind.
>
>>
>>> I also looked a bit at cases where we find that while an entire store
>>> (such as an aggregate initialization or mem*) may not be dead, pieces of
>>> the store may be dead.  That's trivial to detect.  It triggers relatively
>>> often.  The trick is once detected, we have to go back and rewrite the
>>> original statement to only store the live parts.  I've only written the
>>> detection code, the rewriting might be somewhat painful.
>>
>>
>> Yes.  I think SRA has all the code to do that though, see how it
>> does scalarization of constant pool loads like
>
> Ohhh.  Good idea, I'll dig around SRA for a bit and see if there's something
> that can be re-used.
>
>>> I'm starting to wonder if what we have is a 3-part series.
>>>
>>> [1/3] The basic tracking to handle 33562, possibly included in gcc-6
>>> [2/3] Ignore reads that reference stuff not in live_bytes for gcc-7
>>> [3/3] Detect partially dead aggregate stores, rewriting the partially
>>>       dead store to only store the live bytes.  Also for gcc-7.
>>>
>>>
>>> Obviously [1/3] would need compile-time benchmarking, but I really don't
>>> expect any issues there.
>>
>>
>> So what's the overall statistic result on [1/3] if you exclude the
>> clobbers?
>
> Very few, call it a dozen, all in libstdc++.  They weren't significantly
> different than ssa-dse-9.c, so I didn't try to build nice reduced testcases
> for them given we've got existing coverage.
>
> One could argue that with the few real world cases that 33562 could be
> punted to P4 and the patch series deferred to gcc-7.  I wouldn't lose sleep
> over that option.

Way to go at this point IMHO.  That is, keep at P2 please, P4 is for
something else.  It's not a P1 blocker anyway.

Richard.

> Jeff
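
For anyone following the series, here is a rough sketch of the byte-level
tracking idea discussed in the thread.  It is not the patch itself and not
GCC code: the type and function names (struct live_bytes, live_bytes_kill
and so on) are made up for illustration, and the demo assumes a 4-byte int
with no padding in struct S.  The idea is only that each byte of a candidate
store starts out live, later stores clear the bytes they overwrite, and a
later read only matters if it touches a byte that is still live.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STORE_BYTES 64   /* arbitrary cap for this sketch */

/* Hypothetical tracker: one flag per byte written by the candidate store. */
struct live_bytes {
    size_t size;
    bool live[MAX_STORE_BYTES];
};

/* Every byte of the candidate store starts out live. */
void live_bytes_init(struct live_bytes *lb, size_t size)
{
    assert(size <= MAX_STORE_BYTES);
    lb->size = size;
    for (size_t i = 0; i < size; i++)
        lb->live[i] = true;
}

/* A later store to [offset, offset + len) overwrites those bytes,
   so the candidate store's copy of them is no longer live. */
void live_bytes_kill(struct live_bytes *lb, size_t offset, size_t len)
{
    for (size_t i = offset; i < lb->size && i - offset < len; i++)
        lb->live[i] = false;
}

/* A later read only keeps the candidate store alive if it touches
   at least one byte that is still live. */
bool live_bytes_read_conflicts(const struct live_bytes *lb,
                               size_t offset, size_t len)
{
    for (size_t i = offset; i < lb->size && i - offset < len; i++)
        if (lb->live[i])
            return true;
    return false;
}

/* Number of bytes of the candidate store that are still live. */
size_t live_bytes_count(const struct live_bytes *lb)
{
    size_t n = 0;
    for (size_t i = 0; i < lb->size; i++)
        if (lb->live[i])
            n++;
    return n;
}

/* The whole store is dead once no byte of it remains live. */
bool store_is_dead(const struct live_bytes *lb)
{
    return live_bytes_count(lb) == 0;
}

/* Replays the quoted fragment, assuming 4-byte int and no padding. */
bool demo_initial_store_is_dead(void)
{
    struct live_bytes lb;
    live_bytes_init(&lb, 12);      /* s = { {1, 2}, 3 };             */
    live_bytes_kill(&lb, 0, 4);    /* s.r.x = 1;                     */
    live_bytes_kill(&lb, 4, 4);    /* s.r.y = 2;                     */
                                   /* struct R r = {1, 2}: no read   */
    live_bytes_kill(&lb, 8, 4);    /* s.z = 3;                       */
    return store_is_dead(&lb);
}
```

A count of zero corresponds to the fully dead case in [1/3], the read check
corresponds to ignoring reads outside live_bytes in [2/3], and a count
between zero and the store size is the partially dead case that [3/3] would
have to rewrite.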