Good catch.  I agree with Hongtao that although my testing revealed
no problems with the previous version of this patch, it makes sense to
call gen_reg_rtx to generate an pseudo intermediate instead of attempting
to reuse the existing logic that uses ix86_gen_scratch_sse_rtx as an
intermediate.  I've left the existing behaviour the same, so that
memory-to-memory moves (continue to) use ix86_gen_scatch_sse_rtx.

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.

Ok for mainline?


2021-10-13  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        * config/i386/i386-expand.c (ix86_expand_vector_move):  Use a
        pseudo intermediate when moving a SUBREG into a hard register,
        by checking ix86_hardreg_mov_ok.
        (ix86_expand_vector_extract): Store zero-extended SImode
        intermediate in a pseudo, then set target using a SUBREG_PROMOTED
        annotated subreg.
        * config/i386/sse.md (mov<VMOVE>_internal): Prevent CSE creating
        complex (SUBREG) sets of (vector) hard registers before reload, by
        checking ix86_hardreg_mov_ok.

Thanks,
Roger

-----Original Message-----
From: Hongtao Liu <crazy...@gmail.com> 
Sent: 11 October 2021 12:29
To: Roger Sayle <ro...@nextmovesoftware.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] x86_64: Some SUBREG related optimization tweaks to i386 
backend.

On Mon, Oct 11, 2021 at 4:55 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
> gcc/ChangeLog
>         * config/i386/i386-expand.c (ix86_expand_vector_move):  Use a
>         pseudo intermediate when moving a SUBREG into a hard register,
>         by checking ix86_hardreg_mov_ok.

   /* Make operand1 a register if it isn't already.  */
   if (can_create_pseudo_p ()
-      && !register_operand (op0, mode)
-      && !register_operand (op1, mode))
+      && (!ix86_hardreg_mov_ok (op0, op1)  || (!register_operand (op0, 
+ mode)
+      && !register_operand (op1, mode))))
     {
       rtx tmp = ix86_gen_scratch_sse_rtx (GET_MODE (op0));

ix86_gen_scratch_sse_rtx probably returns a hard register, but here you want a 
pseudo register.

--
BR,
Hongtao

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 3e6f7d8e..4a8fa2f 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -615,6 +615,16 @@ ix86_expand_vector_move (machine_mode mode, rtx operands[])
       return;
     }
 
+  /* If operand0 is a hard register, make operand1 a pseudo.  */
+  if (can_create_pseudo_p ()
+      && !ix86_hardreg_mov_ok (op0, op1))
+    {
+      rtx tmp = gen_reg_rtx (GET_MODE (op0));
+      emit_move_insn (tmp, op1);
+      emit_move_insn (op0, tmp);
+      return;
+    }
+
   /* Make operand1 a register if it isn't already.  */
   if (can_create_pseudo_p ()
       && !register_operand (op0, mode)
@@ -16005,11 +16015,15 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, 
rtx vec, int elt)
       /* Let the rtl optimizers know about the zero extension performed.  */
       if (inner_mode == QImode || inner_mode == HImode)
        {
+         rtx reg = gen_reg_rtx (SImode);
          tmp = gen_rtx_ZERO_EXTEND (SImode, tmp);
-         target = gen_lowpart (SImode, target);
+         emit_move_insn (reg, tmp);
+         tmp = gen_lowpart (inner_mode, reg);
+         SUBREG_PROMOTED_VAR_P (tmp) = 1;
+         SUBREG_PROMOTED_SET (tmp, 1);
        }
 
-      emit_insn (gen_rtx_SET (target, tmp));
+      emit_move_insn (target, tmp);
     }
   else
     {
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4559b0c..e43f597 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1270,7 +1270,8 @@
         " C,<sseconstm1>,vm,v"))]
   "TARGET_SSE
    && (register_operand (operands[0], <MODE>mode)
-       || register_operand (operands[1], <MODE>mode))"
+       || register_operand (operands[1], <MODE>mode))
+   && ix86_hardreg_mov_ok (operands[0], operands[1])"
 {
   switch (get_attr_type (insn))
     {

Reply via email to