On Sat, Nov 12, 2011 at 3:19 AM, H.J. Lu <hongjiu...@intel.com> wrote:

> The current x32 implementation uses LEAs to convert 32bit address to
> 64bit.  However, we can use addr32 prefix to use 32bit address directly.
> It improves performance by 5% in SPEC CPU 2K/2006.  All changes are done
> in x86 backend, except for a small unwind library assert change:
>
> http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01555.html
>
> due to return column size difference.
>
> For x86-64, Pmode can be 32bit or 64bit, but word_mode is always 64bit.
> push/pop only work on word_mode.  Also string instructions take Pmode
> pointers.
>
> I will submit a set of patches to use 32bit Pmode for x32.  This is
> the first patch to properly use Pmode and word_mode.  It also adds
> addr32 prefix to string instructions if needed.  OK for trunk?

First round of review comments:

@@ -10252,14 +10260,18 @@ ix86_expand_prologue (void)
       if (r10_live && eax_live)
         {
          t = choose_baseaddr (m->fs.sp_offset - allocate);
-         emit_move_insn (r10, gen_frame_mem (Pmode, t));
+         emit_move_insn (gen_rtx_REG (word_mode, R10_REG),
+                         gen_frame_mem (word_mode, t));
          t = choose_baseaddr (m->fs.sp_offset - allocate - UNITS_PER_WORD);
-         emit_move_insn (eax, gen_frame_mem (Pmode, t));
+         emit_move_insn (gen_rtx_REG (word_mode, AX_REG),
+                         gen_frame_mem (word_mode, t));
        }
       else if (eax_live || r10_live)
        {
          t = choose_baseaddr (m->fs.sp_offset - allocate);
-         emit_move_insn ((eax_live ? eax : r10), gen_frame_mem (Pmode, t));
+         emit_move_insn (gen_rtx_REG (word_mode,
+                                      (eax_live ? AX_REG : R10_REG)),
+                         gen_frame_mem (word_mode, t));
        }
     }
   gcc_assert (m->fs.sp_offset == frame.stack_pointer_offset);

Please just change

      rtx eax = gen_rtx_REG (Pmode, AX_REG);

and
          r10 = gen_rtx_REG (Pmode, R10_REG);

around lines 10305 and 10324. There is also a gen_push in Pmode
right after the former line. Please review how AX and R10 are defined
and used throughout ix86_expand_prologue.

@@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void)
        {
          rtx rax;

-         rax = gen_rtx_REG (Pmode, AX_REG);
-         emit_move_insn (rax, reg10);
+         rax = gen_rtx_REG (word_mode, AX_REG);
+         emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG));
          use_reg (&call_fusage, rax);
        }

Same here. Please review how AX, R10 and R11 are defined and used.
Also, this needs a review from the split-stack author.

@@ -11388,6 +11400,11 @@ ix86_decompose_address (rtx addr, struct ix86_address *out)
   else
     disp = addr;                       /* displacement */

+  /* Since address override works only on the (reg) part in fs:(reg),
+     we can't use it as memory operand.  */
+  if (Pmode != word_mode && seg == SEG_FS && (base || index))
+    return 0;

Can you explain the above some more? IMO, if the override works on
the (reg) part, this is just what we want.
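
For reference, a minimal sketch of the behaviour the comment seems to
describe. It is a toy model, not GCC code, and it assumes that with the
address-size override the (reg) part is truncated to 32 bits and
zero-extended before the segment base is added, so the final sum is not
wrapped to 32 bits. The function names and the FS base value used in
the note below are hypothetical.

```c
#include <stdint.h>

/* Toy model (assumption, not GCC code): linear address the hardware
   forms for fs:(reg) when the addr32 prefix is present.  Only the
   (reg) part is truncated to 32 bits; the segment base is added
   afterwards at full width.  */
static uint64_t
toy_addr32_linear (uint64_t fs_base, int32_t reg)
{
  uint64_t ea = (uint32_t) reg;   /* zero-extended 32-bit effective address */
  return fs_base + ea;
}

/* What a 32-bit pointer computation would expect instead: the whole
   sum wrapped to 32 bits.  */
static uint64_t
toy_wrapped_32bit (uint64_t fs_base, int32_t reg)
{
  return (uint32_t) (fs_base + (uint64_t) (int64_t) reg);
}
```

With a hypothetical FS base of 0x10000000 and reg holding -8, the first
function yields 0x10ffffff8, which is above 4GB, while the second
yields 0x0ffffff8. Whether this case is what the new check guards
against is exactly the question above.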

@@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code)
              gcc_unreachable ();
            }

-         ix86_print_operand (file, x, 0);
+         ix86_print_operand (file, x,
+                             TARGET_64BIT && REG_P (x) ? 'q' : 0);
          return;

This is too big a hammer. You output everything in DImode, so even if
the address is in fact in SImode, you output it in DImode with an
addr32 prefix.
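
To illustrate the concern, here is a small sketch. It is a toy model
with hypothetical names (toy_mode, toy_reg_name,
toy_print_address_reg), not the real ix86_print_operand, and it only
handles the AX register. It assumes REG_P (x) is true and relies on
the 'q' operand code selecting the DImode register name.

```c
/* Toy stand-ins for machine modes; names are hypothetical.  */
enum toy_mode { TOY_SImode = 4, TOY_DImode = 8 };

/* Code 'q' forces the DImode name; code 0 follows the operand's
   own mode.  */
static const char *
toy_reg_name (enum toy_mode mode, int code)
{
  if (code == 'q')
    return "%rax";
  return mode == TOY_DImode ? "%rax" : "%eax";
}

/* The selection from the hunk above, with REG_P (x) assumed true:
   every register on a 64-bit target is printed with code 'q'.  */
static const char *
toy_print_address_reg (enum toy_mode mode, int target_64bit)
{
  return toy_reg_name (mode, target_64bit ? 'q' : 0);
}
```

In this model an SImode address register is printed as %rax on a
64-bit target, whereas printing with code 0 would give %eax.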

Uros.
