Richard Henderson wrote: > On 07/27/2011 10:00 AM, Georg-Johann Lay wrote: >> Richard Henderson wrote: >>>> +;; "*ashluqihiqi3.mem" >>>> +;; "*ashlsqihiqi3.mem" >>>> +(define_insn_and_split "*ashl<extend_prefix>qihiqi3.mem" >>>> + [(set (match_operand:QI 0 "memory_operand" "=m") >>>> + (subreg:QI (ashift:HI (any_extend:HI (match_operand:QI 1 >>>> "register_operand" "r")) >>>> + (match_operand:QI 2 "register_operand" "r")) >>>> + 0))] >>>> + "!reload_completed" >>>> + { gcc_unreachable(); } >>> Surely this isn't necessary. Why would you ever be matching a memory >>> output? >>> >>>> +(define_insn_and_split "*ashlhiqi3" >>>> + [(set (match_operand:QI 0 "nonimmediate_operand" "=r") >>>> + (subreg:QI (ashift:HI (match_operand:HI 1 "register_operand" "0") >>>> + (match_operand:QI 2 "register_operand" >>>> "r")) 0))] >>>> + "!reload_completed" >>>> + { gcc_unreachable(); } >>> Likewise. >>> >>> But the first pattern and the peep2 look good. >>> >> It's that what combine comes up with, and combine is not smart enough >> to find a split point between the mem and the subreg. I don't know >> enough of combine, maybe it's because can_create_pseudo_p is false >> during combine, combine has no spare reg. A combine-split won't >> help as it needs a pseudo/spare reg. > > Hmm. Perhaps. Have you a test case for this? > > r~
char y; void shift2 (char x, unsigned char s) { y = x << s; } Combiner tries: ... Trying 9 -> 10: Failed to match this instruction: (set (mem/c/i:QI (symbol_ref:HI ("y") <var_decl 0xb764c180 y>) [0 y+0 S1 A8]) (subreg:QI (ashift:HI (reg:HI 48 [ x ]) (reg/v:QI 47 [ s ])) 0)) Trying 7, 9 -> 10: Failed to match this instruction: (set (mem/c/i:QI (symbol_ref:HI ("y") <var_decl 0xb764c180 y>) [0 y+0 S1 A8]) (subreg:QI (ashift:HI (zero_extend:HI (reg/v:QI 46 [ x ])) (reg/v:QI 47 [ s ])) 0)) Successfully matched this instruction: (set (reg:HI 50) (zero_extend:HI (reg/v:QI 46 [ x ]))) Failed to match this instruction: (set (mem/c/i:QI (symbol_ref:HI ("y") <var_decl 0xb764c180 y>) [0 y+0 S1 A8]) (subreg:QI (ashift:HI (reg:HI 50) (reg/v:QI 47 [ s ])) 0)) starting the processing of deferred insns ending the processing of deferred insns It sees that it can split out the zero_extend but it does not try to factor out the set, i.e. try to split like this where both instructions would match: (set (reg:QI foo) (subreg:QI (ashift:HI (reg:HI 50) (reg/v:QI 47 [ s ])) 0)) (set (mem/c/i:QI (symbol_ref:HI ("y") <var_decl 0xb764c180 y>) [0 y+0 S1 A8]) (reg:QI foo)) This is not specific the the test case, I see combine frequently to miss such opportunities even if just non-volatile memory is involved. If volatile memory is involved it's even worse because opportunities like load-modify-store, sign- and zero-extends or extracting/inserting are not detected, see PR49807. Johann