pshor...@dataworx.com.au writes:
> On 17.04.2014 13:00, Jeff Law wrote:
>> On 04/16/14 16:19, Richard Henderson wrote:
>>>
>>> The register allocators only select an alternative for a move.
>>> They do not choose between N different patterns, separately
>>> describing loads, stores, and register-to-register movement.
>>>
>>> I'm fairly sure the documentation is quite clear on this, and GCC
>>> has required this since the beginning of time.
>> Correct on all counts; many an hour was spent reimplementing the PA
>> movXX patterns to satisfy that requirement.
>>
>> jeff
> I'm convinced :-) but...
>
> The gcc internals info about 'movm' is fairly comprehensive and I had
> taken care to ensure that I satisfied ...
>
> "The constraints on a 'movm' must permit moving any hard register to
> any other hard register provided..."
>
> by providing a define_expand that assigns from a general_operand to
> a nonimmediate_operand, and ...
>
> a *ldsi instruction that can load from a general_operand to a
> nonimmediate_operand, and a *storesi instruction that can store a
> register_operand to a memory_operand.
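FWIW, the usual way to satisfy the requirement is to collapse *ldsi and
*storesi into a single move pattern whose constraint alternatives cover
register-to-register moves, loads and stores, so that the allocators
only ever pick an alternative.  A rough sketch -- the mnemonics and
constraint letters here are made up rather than taken from your port:

    ;; Hypothetical unified move pattern: one define_insn, three
    ;; alternatives (reg<-reg, reg<-mem, mem<-reg), instead of
    ;; separate *ldsi and *storesi patterns.
    (define_insn "*movsi"
      [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,m")
            (match_operand:SI 1 "general_operand"       "r,m,r"))]
      ""
      "@
       mov\t%0,%1
       ld\t%0,%1
       st\t%1,%0")

A real port would usually need further alternatives (immediates, etc.),
but the point is that all of them live in the one pattern.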
Must admit I can't find where this is documented from a quick look.
Like Jeff and Richard, I could swear it was documented somewhere...

> In any case, out of curiosity and to convince myself I hadn't
> imagined the old reload pass handling this, I reverted my recent
> fixes so that ldsi and storesi were once again as described above,
> then repeated the exercise with full rtl dumping on and compared the
> rtl generated both with and without LRA enabled.
>
> In both cases the *.ira dump produced the triggering ...
>
> (insn 57 61 58 5 (set (reg/v:SI 46 [orig:31 s ] [31])
>         (reg/v:SI 31 [ s ])) 48 {*ldsi}
>      (expr_list:REG_DEAD (reg/v:SI 31 [ s ])
>         (nil)))
>
> The non-LRA reload rtl produced ...
>
> (insn 57 61 67 3 (set (reg:SI 1 r1)
>         (mem/c:SI (plus:HI (reg/f:HI 3 r3)
>                 (const_int 4 [0x4])) [4 %sfp+4 S4 A16])) 48 {*ldsi}
>      (nil))
> (insn 67 57 58 3 (set (mem/c:SI (plus:HI (reg/f:HI 3 r3)
>                 (const_int 4 [0x4])) [4 %sfp+4 S4 A16])
>         (reg:SI 1 r1)) 47 {*storesi}
>      (nil))
>
> LRA, meanwhile, just got stuck in a loop, unable to perform the
> reload of insn 57 that the old reload pass handled (or, more
> correctly, didn't choke over - it seems to be a redundant
> load/store).
>
> I'm really just highlighting this because I know LRA is quite young
> and this might be a hint towards deeper/other issues.

I think LRA just exploits the rule more than reload did.  Reload would
look at the original instruction as a move between two MEMs, pick a
hard register for the destination of the *ldsi, then emit a store from
the hard register to the original MEM destination, on the assumption
that that move must be possible.  (It would never actually match
against the *storesi constraints, or even know that *storesi is the
instruction that would be used.)  It was then down to pot luck whether
that hard register got reused ("inherited") by later instructions that
need (reg/v:SI 46 [orig:31 s ] [31]).
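To make the failure mode concrete, here is an illustrative
reconstruction from the dumps above (the insn and register numbers are
taken from them; the rest is a sketch, not an actual dump):

    ;; What the allocator sees: a pseudo-to-pseudo move that matches
    ;; *ldsi's predicates (general_operand -> nonimmediate_operand):
    (set (reg/v:SI 46 [orig:31 s ] [31]) (reg/v:SI 31 [ s ]))

    ;; Once reg 46 ends up in a stack slot, the same insn is in
    ;; effect a store:
    (set (mem/c:SI (plus:HI (reg/f:HI 3 r3) (const_int 4)))
         (reg:SI 1 r1))

    ;; *ldsi's constraints cannot describe that store, and with the
    ;; store described only by the separate *storesi pattern there is
    ;; no alternative within *ldsi for LRA to reload to.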
LRA tries to take a more global view by introducing new pseudo
registers to represent as-yet unallocated reload registers.  As you
say, a consequence of this is that you can see moves between two
pseudo registers that match the load pattern's predicates but in which
the destination is destined to be a MEM.

Thanks,
Richard