> From: Matthew Fortune
> > +    /* The third argument needs to be in SImode in order to succesfully
> > match
> > +       the operand from the insn definition.  */
> 
> Please refer to operand here not argument as it is the second argument
> to the builtin but third operand of the instruction.  Also double ss in
> successfully.
> 

I have rewritten the comment to address these mistakes.

> > +    case CODE_FOR_loongson_pshufh:
> > +    case CODE_FOR_loongson_psllh:
> > +    case CODE_FOR_loongson_psllw:
> > +    case CODE_FOR_loongson_psrah:
> > +    case CODE_FOR_loongson_psraw:
> > +    case CODE_FOR_loongson_psrlh:
> > +    case CODE_FOR_loongson_psrlw:
> > +      gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
> > +      ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
> > +      ops[2].mode = SImode;
> > +      break;
> > +
> >      case CODE_FOR_msa_addvi_b:
> >      case CODE_FOR_msa_addvi_h:
> >      case CODE_FOR_msa_addvi_w:
> 
> For the record, given paradoxical subregs are a headache...
> I am OK with this on the basis that the argument to psllh etc is actually
> a uint8_t which means that bits 8 upwards are guaranteed to be zero so
> the subreg can be eliminated without any explicit sign or zero extension
> inserted.  This is the same kind of optimisation that combine would
> perform when eliminating zero extension.
> 
> Please can you check that a zero extension is inserted for the following
> case with -O2 or above:
> 
> int16x4_t testme(int16x4_t s, int amount)
> {
>   return psllh_s (s, amount);
> }
> 
> If my understanding is correct there should be an ANDI 0xff inserted
> or similar.
> 

The ANDI 0xff is present for -O0, after the first time the value is loaded
from memory, but it is not generated for -O1 and -O2.
I'm not seeing any zero extension happening for -O1 and -O2.

The only change in the patch below is the fixed comment.

Regards,
Toma

gcc/

        * config/mips/mips.c (mips_expand_builtin_insn): Put the QImode
        argument of the pshufh, psllh, psllw, psrah, psraw, psrlh, psrlw
        builtins into an SImode paradoxical SUBREG.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index da7fa8f..e5b2d9a 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -16574,6 +16574,20 @@ mips_expand_builtin_insn (enum insn_code icode, 
unsigned int nops,
 
   switch (icode)
     {
+    /* The third operand of these instructions is in SImode, so we need to
+       bring the corresponding builtin argument from QImode into SImode.  */
+    case CODE_FOR_loongson_pshufh:
+    case CODE_FOR_loongson_psllh:
+    case CODE_FOR_loongson_psllw:
+    case CODE_FOR_loongson_psrah:
+    case CODE_FOR_loongson_psraw:
+    case CODE_FOR_loongson_psrlh:
+    case CODE_FOR_loongson_psrlw:
+      gcc_assert (has_target_p && nops == 3 && ops[2].mode == QImode);
+      ops[2].value = lowpart_subreg (SImode, ops[2].value, QImode);
+      ops[2].mode = SImode;
+      break;
+
     case CODE_FOR_msa_addvi_b:
     case CODE_FOR_msa_addvi_h:
     case CODE_FOR_msa_addvi_w:

Reply via email to