On Tue, 2016-04-12 at 10:00 +0930, Alan Modra wrote:
> On Fri, Apr 08, 2016 at 01:41:05PM -0700, Richard Henderson wrote:
> > On 04/08/2016 11:10 AM, Bill Schmidt wrote:
> > > The first is an issue with TOC-relative addresses on PowerPC. These are
> > > symbolic addresses that are to be loaded from a fixed slot in the table
> > > of contents, as addressed by the TOC pointer (r2). In the RTL phases
> > > prior to register allocation, these are described in an UNSPEC that
> > > looks like this for an example store:
> > >
> > > (set (mem/c:DI (unspec:DI [
> > >             (symbol_ref:DI ("*.LANCHOR0") [flags 0x182])
> > >             (reg:DI 2 2)
> > >         ] UNSPEC_TOCREL) [1 svul+0 S8 A128])
> > >     (reg:DI 178))
> > >
> > > The UNSPEC helps keep track of the r2 reference until this is split into
> > > two or more insns depending on the memory model.
> > >
> > That's why Alpha uses LO_SUM for pre-reload tracking of such things.
> >
> > Even though that's a bit of a liberty, since there's no HIGH to go along with
> > the LO_SUM. But at least it allows the middle-end to continue to find the
> > symbol.
>
> I wish I'd been made aware of the problem with alias analysis when I
> invented this scheme for -mcmodel=medium code..

It's certainly subtle. I had to be pretty lucky to discover it, as the
only effect is to rather harmlessly say "who knows" rather than giving a
definite answer.

>
> Back in gcc-4.3 days, when small-model code was the only option, we
> used to generate
>   mem (plus ((reg 2) (const (minus ((symbol_ref)
>                                     (symbol_ref toc_base))))))
> for a toc mem reference, which accurately reflects the addressing.
>
> The problem is that when splitting this to a high/lo_sum you lose the
> r2 reference in the lo_sum, and that allows r2 to die prematurely,
> breaking an important linker code editing optimisation.
>
> Hmm. Maybe if we rewrote the mem to
>   mem (plus ((symbol_ref toc_base) (const (minus ((symbol_ref)
>                                                   (reg 2))))))
> It might look odd, but is no lie. r2 is equal to toc_base. Or
> perhaps we could lie a little and simply omit the plus and toc_base
> reference?
>
> Either way, when we split to
>   set (reg tmp) (high (const (minus ((symbol_ref) (reg 2)))))
>   .. mem (lo_sum (reg tmp) (const (minus ((symbol_ref) (reg 2)))))
> both high and lo_sum reference r2 and the linker could happily replace
> rtmp in the lo_sum insn with r2 when the high address is known to be
> zero.

Yes, this sounds promising. And it really helps to know the history
here -- you saved me a lot of digging through the archives, since I
didn't want to rediscover the issue behind the present design.

>
> Bill, do you have test cases for the alias problem? Is this something
> that needs fixing for gcc-6?
>

Last question first ... no, I don't think it does. It's generally fine
for the structural aliasing to report "I don't know" and let other
checks decide whether aliasing can exist; it just isn't optimal. I only
spotted this because getting past this check allowed me to run into a
problem in my code that was exposed in the TBAA checks afterwards. I ran
into this with an experimental patch for GCC 7.
I can send you a copy of the patch, and point you to the test in the
test suite that exhibits the problem when that patch is applied. I'll do
that offline.

Thanks, Alan!

Bill