Thanks, this addresses most of my comments from the v8 review. There were a couple left over though:
chenglulu <chengl...@loongson.cn> writes: > +(define_attr "compression" "none,all" > + (const_string "none")) I still don't understand the purpose of keeping this for LoongArch. > +(define_insn "truncdisi2_extended" > + [(set (match_operand:SI 0 "nonimmediate_operand" "=ZC") > + (truncate:SI (match_operand:DI 1 "register_operand" "r")))] > + "TARGET_64BIT" > + "stptr.w\t%1,%0" > + [(set_attr "move_type" "store") > + (set_attr "mode" "SI")]) Does this pattern ever match, now that TRULY_NOOP_TRUNCATION is 1? It looks like it could be deleted. It looks like this version removes: < +(define_insn "*mulsi3_extended" < + [(set (match_operand:DI 0 "register_operand" "=r") < + (sign_extend:DI < + (mult:SI (match_operand:SI 1 "register_operand" "r") < + (match_operand:SI 2 "register_operand" "r"))))] < + "TARGET_64BIT" < + "mul.w\t%0,%1,%2" < + [(set_attr "type" "imul") < + (set_attr "mode" "SI")]) but that pattern should still be useful. On: > >> +(define_split > >> + [(match_operand 0 "small_data_pattern")] > >> + "reload_completed" > >> + [(match_dup 0)] > >> + { operands[0] = loongarch_rewrite_small_data (operands[0]); }) > >> + > >> + > >> +;; Match paired HI/SI/SF/DFmode load/stores. > >> +(define_insn "*join2_load_store<JOIN_MODE:mode>" > >> + [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand" > >> + "=r,f,m,m,r,ZC,r,k,f,k") > >> + (match_operand:JOIN_MODE 1 "nonimmediate_operand" > >> "m,m,r,f,ZC,r,k,r,k,f")) > >> + (set (match_operand:JOIN_MODE 2 "nonimmediate_operand" > >> + "=r,f,m,m,r,ZC,r,k,f,k") > >> + (match_operand:JOIN_MODE 3 "nonimmediate_operand" > >> "m,m,r,f,ZC,r,k,r,k,f"))] > >> + "reload_completed" > >> + { > >> + bool load_p = (which_alternative == 0 || which_alternative == 1); > >> + /* Reg-renaming pass reuses base register if it is dead after bonded > >> loads. > >> + Hardware does not bond those loads, even when they are consecutive. > >> + However, order of the loads need to be checked for correctness. */ > >> + if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1])) > >> + { > > I'm not sure I understand how these patterns work, but it looks like the > > condition above is trying to work around a later change to the insn by > > regrename, after peephole2 has checked loongarch_load_store_bonding_p. > > If so, you should be able to avoid that by marking the destinations of > > the loads as earlyclobbers, using "&r" instead of "r" for the first > > alternative. regrename should then preserve the conditions that > > loongarch_load_store_bonding_p checked earlier. > > > > Same for the other patterns. > > > Hi, > > I think peephole pass is after reload pass, so peephole pass don't need '&'. When making changes to instructions, all post-reload passes have to honour the "&" (via constrain_operands). My point was: it looks like *join2_load_store<JOIN_MODE:mode> is only created by peephole2s, and those peephole2s check that there is no overlap for loads. So the condition above: !load_p || !reg_overlap_mentioned_p (operands[0], operands[1]) should always be true for the original form of the instruction (as created by peephole2). So the question then is: when is the condition quoted above false? And I think a possible answer is: if regrename changes the registers in the instruction after peephole2 has created it. Adding "&" would force regrename to ensure that the load destination does not overlap the source, so that the condition above would always be true. So if you add the "&"s, it should be possible to turn the condition above into an assert and only have one version of the asm output, instead of two. LGTM otherwise. Thanks, Richard