https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #7 from Bernd Edlinger <bernd.edlinger at hotmail dot de> ---
(In reply to Richard Earnshaw from comment #6)
> (In reply to Bernd Edlinger from comment #5)
> > (In reply to Wilco from comment #4)
> > > However dealing with partial overlaps is complex so maybe the best option
> > > would be to add alternatives to <shift>di3_neon to either allow full
> > > overlap "r 0 X X X" or no overlap "&r r X X X". The shift code works
> > > with full overlap.
> >
> > That sounds like a good idea.
> >
> > Then this condition in <shift>di3_neon could go away too:
> >
> >   && (!reg_overlap_mentioned_p (operands[0], operands[1])
> >       || REGNO (operands[0]) == REGNO (operands[1])))
>
> Note that we don't want to restrict complete overlaps, only partial
> overlaps.  Restricting complete overlaps leads to significant increase in
> register pressure and a lot of redundant copying.

Yes.

That is Wilco's idea: instead of =r 0r X X X use =r 0 X X X and =&r r X X X,
that should ensure that no partial overlap happens, just full overlap
or nothing.  That's what arm_emit_coreregs_64bit_shift and
arm_ashldi3_1bit can handle.

Who will do it?
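
For illustration only, here is a small standalone C sketch of the three overlap cases being discussed. It is not GCC code: it assumes a simplified model in which a 64-bit value held in core registers occupies the pair (regno, regno + 1), and the names overlap_kind, classify_overlap and overlap_is_acceptable are hypothetical. It is meant to show why "full overlap or nothing" is the property the two proposed alternatives would enforce, and which case the quoted condition currently rejects.

```c
#include <stdbool.h>

/* Hypothetical model, not GCC internals: a 64-bit value in core
   registers is assumed to occupy the pair (regno, regno + 1).  */
enum overlap_kind { OVERLAP_NONE, OVERLAP_FULL, OVERLAP_PARTIAL };

/* Classify how the destination pair relates to the source pair.  */
static enum overlap_kind
classify_overlap (unsigned dest_regno, unsigned src_regno)
{
  if (dest_regno == src_regno)
    return OVERLAP_FULL;        /* what a matching "0" constraint gives */
  if (dest_regno + 1 == src_regno || src_regno + 1 == dest_regno)
    return OVERLAP_PARTIAL;     /* the case the shift code cannot handle */
  return OVERLAP_NONE;          /* what an earlyclobber "&r" output gives */
}

/* Same shape as the quoted condition: accept no overlap or full
   overlap, reject only partial overlap.  */
static bool
overlap_is_acceptable (unsigned dest_regno, unsigned src_regno)
{
  return classify_overlap (dest_regno, src_regno) != OVERLAP_PARTIAL;
}
```

In this model, a destination in r0/r1 with a source in r1/r2 is the partial case; if the constraints only ever permit the other two cases, the explicit check becomes redundant.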