Thanks, this addresses most of my comments from the v8 review.
There were a couple left over though:

chenglulu <chengl...@loongson.cn> writes:
> +(define_attr "compression" "none,all"
> +  (const_string "none"))

I still don't understand the purpose of keeping this for LoongArch.

> +(define_insn "truncdisi2_extended"
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=ZC")
> +     (truncate:SI (match_operand:DI 1 "register_operand" "r")))]
> +  "TARGET_64BIT"
> +  "stptr.w\t%1,%0"
> +  [(set_attr "move_type" "store")
> +   (set_attr "mode" "SI")])

Does this pattern ever match, now that TRULY_NOOP_TRUNCATION is 1?
It looks like it could be deleted.

It looks like this version removes:

< +(define_insn "*mulsi3_extended"
< +  [(set (match_operand:DI 0 "register_operand" "=r")
< +     (sign_extend:DI
< +         (mult:SI (match_operand:SI 1 "register_operand" "r")
< +                  (match_operand:SI 2 "register_operand" "r"))))]
< +  "TARGET_64BIT"
< +  "mul.w\t%0,%1,%2"
< +  [(set_attr "type" "imul")
< +   (set_attr "mode" "SI")])

but that pattern should still be useful.

On:

> >> +(define_split
> >> +  [(match_operand 0 "small_data_pattern")]
> >> +  "reload_completed"
> >> +  [(match_dup 0)]
> >> +  { operands[0] = loongarch_rewrite_small_data (operands[0]); })
> >> +
> >> +
> >> +;; Match paired HI/SI/SF/DFmode load/stores.
> >> +(define_insn "*join2_load_store<JOIN_MODE:mode>"
> >> +  [(set (match_operand:JOIN_MODE 0 "nonimmediate_operand"
> >> +  "=r,f,m,m,r,ZC,r,k,f,k")
> >> +  (match_operand:JOIN_MODE 1 "nonimmediate_operand" 
> >> "m,m,r,f,ZC,r,k,r,k,f"))
> >> +   (set (match_operand:JOIN_MODE 2 "nonimmediate_operand"
> >> +   "=r,f,m,m,r,ZC,r,k,f,k")
> >> +  (match_operand:JOIN_MODE 3 "nonimmediate_operand" 
> >> "m,m,r,f,ZC,r,k,r,k,f"))]
> >> +  "reload_completed"
> >> +  {
> >> +    bool load_p = (which_alternative == 0 || which_alternative == 1);
> >> +    /* Reg-renaming pass reuses base register if it is dead after bonded 
> >> loads.
> >> +       Hardware does not bond those loads, even when they are consecutive.
> >> +       However, order of the loads need to be checked for correctness.  */
> >> +    if (!load_p || !reg_overlap_mentioned_p (operands[0], operands[1]))
> >> +      {
> > I'm not sure I understand how these patterns work, but it looks like the
> > condition above is trying to work around a later change to the insn by
> > regrename, after peephole2 has checked loongarch_load_store_bonding_p.
> > If so, you should be able to avoid that by marking the destinations of
> > the loads as earlyclobbers, using "&r" instead of "r" for the first
> > alternative.  regrename should then preserve the conditions that
> > loongarch_load_store_bonding_p checked earlier.
> >
> > Same for the other patterns.
> >
> Hi,
>
> I think peephole pass is after reload pass, so peephole pass don't need '&'.

When making changes to instructions, all post-reload passes have to honour
the "&" (via constrain_operands).

My point was: it looks like *join2_load_store<JOIN_MODE:mode> is only
created by peephole2s, and those peephole2s check that there is no
overlap for loads.  So the condition above:

   !load_p || !reg_overlap_mentioned_p (operands[0], operands[1])

should always be true for the original form of the instruction
(as created by peephole2).  So the question then is: when is the
condition quoted above false?  And I think a possible answer is:
if regrename changes the registers in the instruction after peephole2
has created it.  Adding "&" would force regrename to ensure that the
load destination does not overlap the source, so that the condition
above would always be true.

So if you add the "&"s, it should be possible to turn the condition
above into an assert and only have one version of the asm output,
instead of two.

LGTM otherwise.

Thanks,
Richard

Reply via email to