Hi,

On Mon, 5 Mar 2007, Roman Zippel wrote:

> > Yes, it is in general better now to split double-word length
> > operations before reload.  It's not necessarily better to split as
> > early as possible, as that will essentially disable the RTL level loop
> > optimizations. 
> 
> I was worried about that too, since some patterns would be more 
> complicated, which may make some optimizations more difficult, e.g. where 
> I noticed it first was mulsidi3, which currently looks like this:

Ok, I played a little more with umulsidi3 (see attached patch, only the 
umulsidi3 pattern are relevant for this) and I think found a bug somwhere, 
although I'm not sure whether the subreg pass causes it or only trigggers 
it (it goes away with -fno-split-wide-types).

I removed the '%' from the constraint to test proper reloads (which may 
not be here, but probably in other cases) and used the following testcase:

unsigned long long f(void)
{
        register unsigned int a asm ("%d0");
        register unsigned int b asm ("%d1");
        asm ("# %0  %1" : "=d" (a), "=d" (b));
        return ((unsigned long long)b * (unsigned long long)a);
}

This is the situation prior to subreg2:

(insn 32 10 30 2 (parallel [
            (set (strict_low_part (subreg:SI (reg:DI 33) 4))
                (mult:SI (reg/v:SI 0 %d0 [ a ])
                    (reg/v:SI 1 %d1 [ b ])))
            (set (strict_low_part (subreg:SI (reg:DI 33) 0))
                (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg/v:SI 0 
%d0 [ a ]))
                            (zero_extend:DI (reg/v:SI 1 %d1 [ b ])))
                        (const_int 32 [0x20]))))
        ]) -1 (nil)
    (expr_list:REG_DEAD (reg/v:SI 0 %d0 [ a ])
        (expr_list:REG_DEAD (reg/v:SI 1 %d1 [ b ])
            (nil))))

(insn 30 32 31 2 (set (reg:SI 0 %d0 [ <result> ])
        (subreg:SI (reg:DI 33) 0)) 31 {*m68k.md:673} (insn_list:REG_DEP_TRUE 11 
(nil))
    (nil))

(insn 31 30 23 2 (set (reg:SI 1 %d1 [orig:0 <result>+4 ] [0])
        (subreg:SI (reg:DI 33) 4)) 31 {*m68k.md:673} (nil)
    (expr_list:REG_DEAD (reg:DI 33)
        (nil)))

(insn 23 31 0 2 (use (reg/i:DI 0 %d0 [ <result> ])) -1 (insn_list:REG_DEP_TRUE 
30 (nil))
    (nil))


subreg2 produces this (which looks correct):

(insn 32 10 30 2 (parallel [
            (set (strict_low_part (reg:SI 38 [+4 ]))
                (mult:SI (reg/v:SI 0 %d0 [ a ])
                    (reg/v:SI 1 %d1 [ b ])))
            (set (strict_low_part (reg:SI 37))
                (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg/v:SI 0 
%d0 [ a ]))
                            (zero_extend:DI (reg/v:SI 1 %d1 [ b ])))
                        (const_int 32 [0x20]))))
        ]) 181 {*umulsidi3_subreg} (nil)
    (expr_list:REG_DEAD (reg/v:SI 0 %d0 [ a ])
        (expr_list:REG_DEAD (reg/v:SI 1 %d1 [ b ])
            (nil))))

(insn 30 32 31 2 (set (reg:SI 0 %d0 [ <result> ])
        (reg:SI 37)) 31 {*m68k.md:673} (insn_list:REG_DEP_TRUE 11 (nil))
    (expr_list:REG_DEAD (reg:SI 37)
        (nil)))

(insn 31 30 23 2 (set (reg:SI 1 %d1 [orig:0 <result>+4 ] [0])
        (reg:SI 38 [+4 ])) 31 {*m68k.md:673} (nil)
    (expr_list:REG_DEAD (reg:SI 38 [+4 ])
        (nil)))

(insn 23 31 0 2 (use (reg/i:DI 0 %d0 [ <result> ])) -1 (insn_list:REG_DEP_TRUE 
30 (nil))
    (nil))


Reload has now to match (reg %d0) and (reg 38) above in insn 32 and after 
pseudo register replacement it looks like this:

(insn 32 10 30 2 (parallel [
            (set (strict_low_part (reg:SI 1 %d1 [orig:38+4 ] [38]))
                (mult:SI (reg/v:SI 0 %d0 [ a ])
                    (reg/v:SI 1 %d1 [ b ])))
            (set (strict_low_part (reg:SI 0 %d0 [37]))
                (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg/v:SI 0 
%d0 [ a ]))
                            (zero_extend:DI (reg/v:SI 1 %d1 [ b ])))
                        (const_int 32 [0x20]))))
        ]) 181 {*umulsidi3_subreg} (nil)
    (expr_list:REG_DEAD (reg/v:SI 0 %d0 [ a ])
        (expr_list:REG_DEAD (reg/v:SI 1 %d1 [ b ])
            (nil))))

Notice that the REG_DEAD notes are not correct anymore, reload produces 
now the following reload:

Reloads for insn # 32
Reload 0: reload_in (SI) = (reg/v:SI 0 %d0 [ a ])
        reload_out (SI) = (reg:SI 1 %d1 [orig:38+4 ] [38])
        DATA_REGS, RELOAD_OTHER (opnum = 0)
        reload_in_reg: (reg/v:SI 0 %d0 [ a ])
        reload_out_reg: (reg:SI 1 %d1 [orig:38+4 ] [38])
        reload_reg_rtx: (reg/v:SI 0 %d0 [ a ])

The problem here is it can't use %d0 as reload register (but it does it 
anyway, because it thinks it's dead), as it's used for (reg 37) and reload 
produces this:

(insn 32 10 34 2 (parallel [
            (set (strict_low_part (reg/v:SI 0 %d0 [ a ]))
                (mult:SI (reg/v:SI 0 %d0 [ a ])
                    (reg/v:SI 1 %d1 [ b ])))
            (set (strict_low_part (reg:SI 0 %d0 [37]))
                (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (reg/v:SI 0 
%d0 [ a ]))
                            (zero_extend:DI (reg/v:SI 1 %d1 [ b ])))
                        (const_int 32 [0x20]))))
        ]) 181 {*umulsidi3_subreg} (nil)
    (nil))

%d0 is used in both set destinations and postreload dies because of it.
The question is now, who is responsible for telling reload that %d0/%d1 
are not really dead anymore after register allocation?

bye, Roman
---
 gcc/config/m68k/m68k.md |  104 ++++++++++++++++++++++++------------------------
 1 file changed, 54 insertions(+), 50 deletions(-)

Index: gcc/gcc/config/m68k/m68k.md
===================================================================
--- gcc.orig/gcc/config/m68k/m68k.md
+++ gcc/gcc/config/m68k/m68k.md
@@ -2672,77 +2672,81 @@
 ;; proper matching constraint.  This is because the matching is between
 ;; the high-numbered word of the DImode operand[0] and operand[1].
 (define_expand "umulsidi3"
+  [(set (match_operand:DI 0 "register_operand" "")
+       (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" ""))
+                (zero_extend:DI (match_operand:SI 2 "register_operand" ""))))]
+  "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
+  "")
+
+(define_insn_and_split "*umulsidi3"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+       (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "%0"))
+                (zero_extend:DI (match_operand:SI 2 "nonimmediate_operand" 
"dm"))))]
+  "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
+  "#"
+  ""
   [(parallel
-    [(set (subreg:SI (match_operand:DI 0 "register_operand" "") 4)
-         (mult:SI (match_operand:SI 1 "register_operand" "")
-                  (match_operand:SI 2 "register_operand" "")))
-     (set (subreg:SI (match_dup 0) 0)
+    [(set (strict_low_part (match_dup 3))
+         (mult:SI (match_dup 1) (match_dup 2)))
+     (set (strict_low_part (match_dup 4))
          (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (match_dup 1))
                                             (zero_extend:DI (match_dup 2)))
                                    (const_int 32))))])]
-  "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
-  "")
+  "split_di(operands, 1, operands + 3, operands + 4);")
 
-(define_insn ""
-  [(set (match_operand:SI 0 "register_operand" "=d")
-       (mult:SI (match_operand:SI 1 "register_operand" "%0")
-                 (match_operand:SI 2 "nonimmediate_operand" "dm")))
-   (set (match_operand:SI 3 "register_operand" "=d")
-       (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (match_dup 1))
-                                          (zero_extend:DI (match_dup 2)))
+(define_insn "*umulsidi3_subreg"
+  [(set (strict_low_part (match_operand:SI 0 "register_operand" "=d"))
+       (mult:SI (match_operand:SI 2 "register_operand" "0")
+                (match_operand:SI 3 "nonimmediate_operand" "dm")))
+   (set (strict_low_part (match_operand:SI 1 "register_operand" "+d"))
+       (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (match_dup 2))
+                                          (zero_extend:DI (match_dup 3)))
                                  (const_int 32))))]
   "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
-  "mulu%.l %2,%3:%0")
+  "mulu%.l %3,%1:%0")
 
 ; Match immediate case.  For 2.4 only match things < 2^31.
 ; It's tricky with larger values in these patterns since we need to match
 ; values between the two parallel multiplies, between a CONST_DOUBLE and
 ; a CONST_INT.
-(define_insn ""
-  [(set (match_operand:SI 0 "register_operand" "=d")
-       (mult:SI (match_operand:SI 1 "register_operand" "%0")
-                (match_operand:SI 2 "const_int_operand" "n")))
-   (set (match_operand:SI 3 "register_operand" "=d")
-       (truncate:SI (lshiftrt:DI (mult:DI (zero_extend:DI (match_dup 1))
-                                          (match_dup 2))
-                                 (const_int 32))))]
+;;(define_insn "*umulsidi3_const"
+;;  [(set (match_operand:DI 0 "register_operand" "=d")
+;;     (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "0"))
+;;              (match_operand:SI 2 "const_int_operand" "n")))]
+;;  "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
+;;  "mulu%.l %2,%0:%R0")
+
+(define_insn "*umulsidi3_const"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+       (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand" "%0"))
+                (match_operand:DI 2 "const_double_operand" "n")))]
   "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE
-   && (unsigned) INTVAL (operands[2]) <= 0x7fffffff"
-  "mulu%.l %2,%3:%0")
+   && ((CONST_INT_P (operands[2])
+       && INTVAL (operands[2]) >= 0)
+       || (GET_CODE (operands[2]) == CONST_DOUBLE
+          && CONST_DOUBLE_HIGH (operands[2]) == 0))"
+  "mulu%.l %2,%0:%R0")
 
 (define_expand "mulsidi3"
-  [(parallel
-    [(set (subreg:SI (match_operand:DI 0 "register_operand" "") 4)
-         (mult:SI (match_operand:SI 1 "register_operand" "")
-                  (match_operand:SI 2 "register_operand" "")))
-     (set (subreg:SI (match_dup 0) 0)
-         (truncate:SI (lshiftrt:DI (mult:DI (sign_extend:DI (match_dup 1))
-                                            (sign_extend:DI (match_dup 2)))
-                                   (const_int 32))))])]
+  [(set (match_operand:DI 0 "register_operand" "")
+       (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" ""))
+                (sign_extend:DI (match_operand:SI 2 "register_operand" ""))))]
   "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
   "")
 
-(define_insn ""
-  [(set (match_operand:SI 0 "register_operand" "=d")
-       (mult:SI (match_operand:SI 1 "register_operand" "%0")
-                (match_operand:SI 2 "nonimmediate_operand" "dm")))
-   (set (match_operand:SI 3 "register_operand" "=d")
-       (truncate:SI (lshiftrt:DI (mult:DI (sign_extend:DI (match_dup 1))
-                                          (sign_extend:DI (match_dup 2)))
-                                 (const_int 32))))]
+(define_insn "*mulsidi3"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+       (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "%0"))
+                (sign_extend:DI (match_operand:SI 2 "nonimmediate_operand" 
"dm"))))]
   "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
-  "muls%.l %2,%3:%0")
+  "muls%.l %2,%0:%R0")
 
-(define_insn ""
-  [(set (match_operand:SI 0 "register_operand" "=d")
-       (mult:SI (match_operand:SI 1 "register_operand" "%0")
-                (match_operand:SI 2 "const_int_operand" "n")))
-   (set (match_operand:SI 3 "register_operand" "=d")
-       (truncate:SI (lshiftrt:DI (mult:DI (sign_extend:DI (match_dup 1))
-                                          (match_dup 2))
-                                 (const_int 32))))]
+(define_insn "*mulsidi3_const"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+       (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand" "%0"))
+                (match_operand:SI 2 "const_int_operand" "n")))]
   "TARGET_68020 && !TUNE_68060 && !TARGET_COLDFIRE"
-  "muls%.l %2,%3:%0")
+  "muls%.l %2,%0:%R0")
 
 (define_expand "umulsi3_highpart"
   [(parallel

Reply via email to