Hello!

Attached patch introduces generation of addr32 prefixed addresses,
mainly intended to merge ZERO_EXTRACTed LEA calculations into address.
 After fixing various inconsistencies with "o" constraints, the patch
works surprisingly well (in its current form fixes all reported
problems in the PR [1]), but one problem remains w.r.t. handling of
"o" constraint.

Patched gcc ICEs on gcc.dg/torture/pr47744-2.c with:

$ ~/gcc-build-fast/gcc/cc1 -O2 -mx32 -std=gnu99 -quiet pr47744-2.c
pr47744-2.c: In function ‘matmul_i16’:
pr47744-2.c:40:1: error: insn does not satisfy its constraints:
(insn 116 66 67 4 (set (reg:TI 0 ax)
        (mem:TI (zero_extend:DI (plus:SI (reg:SI 4 si [orig:114
ivtmp.26 ] [114])
                    (reg:SI 5 di [orig:101 dest_y ] [101]))) [6
MEM[base: dest_y_18, index: ivtmp.26_53, offset: 0B]+0 S16 A128]))
pr47744-2.c:34 60 {*movti_internal_rex64}
     (nil))
pr47744-2.c:40:1: internal compiler error: in
reload_cse_simplify_operands, at postreload.c:403
Please submit a full bug report,
...

... due to the fact that the address is not offsetable, and plus
((zero_extend (...)) (const_int ...)) gets rejected from
ix86_legitimate_address_p.

However, the section "16.8.1 Simple Constraints" of the documentation claims:

--quote--
   * A nonoffsettable memory reference can be reloaded by copying the
     address into a register.  So if the constraint uses the letter
     `o', all memory references are taken care of.
--/quote--

As I read this sentence, the RTX is forced into a temporary register,
and reload tries to satisfy "o" constraint with plus ((reg ...)
(const_int ...)), as said at the introduction of "o" constraint a
couple of pages earlier. Unfortunately, this does not seem to be the
case.

Is there anything wrong with my approach, or is there something wrong in reload?

2011-08-05  Uros Bizjak  <ubiz...@gmail.com>

        PR target/49781
        * config/i386/i386.c (ix86_decompose_address): Allow zero-extended
        SImode addresses.
        (ix86_print_operand_address): Handle zero-extended addresses.
        (memory_address_length): Add length of addr32 prefix for
        zero-extended addresses.
        * config/i386/predicates.md (lea_address_operand): Reject
        zero-extended operands.

Patch is otherwise bootstrapped and tested on x86_64-pc-linux-gnu
{,-m32} without regressions.

[1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781

Thanks,
Uros.
Index: config/i386/predicates.md
===================================================================
--- config/i386/predicates.md   (revision 177456)
+++ config/i386/predicates.md   (working copy)
@@ -801,6 +801,10 @@
   struct ix86_address parts;
   int ok;
 
+  /*  LEA handles zero-extend by itself.  */
+  if (GET_CODE (op) == ZERO_EXTEND)
+    return false;
+
   ok = ix86_decompose_address (op, &parts);
   gcc_assert (ok);
   return parts.seg == SEG_DEFAULT;
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c  (revision 177456)
+++ config/i386/i386.c  (working copy)
@@ -11146,6 +11146,14 @@ ix86_decompose_address (rtx addr, struct ix86_addr
   int retval = 1;
   enum ix86_address_seg seg = SEG_DEFAULT;
 
+  /* Allow zero-extended SImode addresses,
+     they will be emitted with addr32 prefix.  */
+  if (TARGET_64BIT
+      && GET_CODE (addr) == ZERO_EXTEND
+      && GET_MODE (addr) == DImode
+      && GET_MODE (XEXP (addr, 0)) == SImode)
+    addr = XEXP (addr, 0);
+ 
   if (REG_P (addr))
     base = addr;
   else if (GET_CODE (addr) == SUBREG)
@@ -14163,9 +14171,13 @@ ix86_print_operand_address (FILE *file, rtx addr)
     }
   else
     {
-      /* Print DImode registers on 64bit targets to avoid addr32 prefixes.  */
-      int code = TARGET_64BIT ? 'q' : 0;
+      int code = 0;
 
+      /* Print SImode registers for zero-extended addresses to force
+        addr32 prefix.  Otherwise print DImode registers to avoid it.  */
+      if (TARGET_64BIT)
+       code = (GET_CODE (addr) == ZERO_EXTEND) ? 'l' : 'q';
+
       if (ASSEMBLER_DIALECT == ASM_ATT)
        {
          if (disp)
@@ -21776,7 +21788,8 @@ assign_386_stack_local (enum machine_mode mode, en
 }
 
 /* Calculate the length of the memory address in the instruction
-   encoding.  Does not include the one-byte modrm, opcode, or prefix.  */
+   encoding.  Includes addr32 prefix, does not include the one-byte modrm,
+   opcode, or other prefixes.  */
 
 int
 memory_address_length (rtx addr)
@@ -21803,8 +21816,10 @@ memory_address_length (rtx addr)
   base = parts.base;
   index = parts.index;
   disp = parts.disp;
-  len = 0;
 
+  /* Add length of addr32 prefix.  */
+  len = (GET_CODE (addr) == ZERO_EXTEND);
+
   /* Rule of thumb:
        - esp as the base always wants an index,
        - ebp as the base always wants a displacement,

Reply via email to