On Sat, Nov 12, 2011 at 3:19 AM, H.J. Lu <hongjiu...@intel.com> wrote:
> The current x32 implementation uses LEAs to convert 32bit address to > 64bit. However, we can use addr32 prefix to use 32bit address directly. > It improves performance by 5% in SPEC CPU 2K/2006. All changes are done > in x86 backend, except for a smaill unwind library assert change: > > http://gcc.gnu.org/ml/gcc-patches/2011-11/msg01555.html > > due to return column size difference. > > For x86-64, Pmode can be 32bit or 64bit, but word_mode is always 64bit. > push/pop only work on word_mode. Also string instructions take Pmode > pointers. > > I will submit a set of patches to use 32bit Pmode for x32. This is > the first patch to properly use Pmode and word_mode. It also adds > addr32 prefix to string instructions if needed. OK for trunk? First round of review comments: @@ -10252,14 +10260,18 @@ ix86_expand_prologue (void) if (r10_live && eax_live) { t = choose_baseaddr (m->fs.sp_offset - allocate); - emit_move_insn (r10, gen_frame_mem (Pmode, t)); + emit_move_insn (gen_rtx_REG (word_mode, R10_REG), + gen_frame_mem (word_mode, t)); t = choose_baseaddr (m->fs.sp_offset - allocate - UNITS_PER_WORD); - emit_move_insn (eax, gen_frame_mem (Pmode, t)); + emit_move_insn (gen_rtx_REG (word_mode, AX_REG), + gen_frame_mem (word_mode, t)); } else if (eax_live || r10_live) { t = choose_baseaddr (m->fs.sp_offset - allocate); - emit_move_insn ((eax_live ? eax : r10), gen_frame_mem (Pmode, t)); + emit_move_insn (gen_rtx_REG (word_mode, + (eax_live ? AX_REG : R10_REG)), + gen_frame_mem (word_mode, t)); } } gcc_assert (m->fs.sp_offset == frame.stack_pointer_offset); Please just change rtx eax = gen_rtx_REG (Pmode, AX_REG); and r10 = gen_rtx_REG (Pmode, R10_REG); around line 10305 and line 10324. You also have gen_push in Pmode, just following the former line. Please review the whole ix86_expand_prologue how AX and R10 are defined and used. @@ -11060,8 +11072,8 @@ ix86_expand_split_stack_prologue (void) { rtx rax; - rax = gen_rtx_REG (Pmode, AX_REG); - emit_move_insn (rax, reg10); + rax = gen_rtx_REG (word_mode, AX_REG); + emit_move_insn (rax, gen_rtx_REG (word_mode, R10_REG)); use_reg (&call_fusage, rax); } Same here. Please review how AX, R10 and R11 are defined and used. Also, this needs review from split stack author. @@ -11388,6 +11400,11 @@ ix86_decompose_address (rtx addr, struct ix86_address *out) else disp = addr; /* displacement */ + /* Since address override works only on the (reg) part in fs:(reg), + we can't use it as memory operand. */ + if (Pmode != word_mode && seg == SEG_FS && (base || index)) + return 0; Can you explain the above some more? IMO, if the override works on (reg) part, this is just what we want. @@ -13637,7 +13665,8 @@ ix86_print_operand (FILE *file, rtx x, int code) gcc_unreachable (); } - ix86_print_operand (file, x, 0); + ix86_print_operand (file, x, + TARGET_64BIT && REG_P (x) ? 'q' : 0); return; This is too big hammer. You output everything in DImode, so even if the address is in fact in SImode, you output it in DImode with an addr32 prefix. Uros.