"Fu, Chao-Ying" <[EMAIL PROTECTED]> writes:
>   After tracing GCC 4.x to see why MADD is not generated for MIPS32,
> I found out the main issue is that the pattern "adddi3"
> is not available for MIPS32.  Because adddi3 is missing,
> GCC 4.x needs to split a 64-bit addition into 4 separate
> RTL insns.  As a result, the combine phase fails
> to combine the RTL insns into a single madd pattern.
>
>   Could we enable "adddi3" for MIPS32 in GCC 4.x?  Or is there a 
> better way to generate MADD?  Thanks a lot!

The problem with:

> Ex: (mips.md in GCC 3.4)
> (define_expand "adddi3"
>   [(parallel [(set (match_operand:DI 0 "register_operand" "")
>                    (plus:DI (match_operand:DI 1 "register_operand" "")
>                             (match_operand:DI 2 "arith_operand" "")))
>               (clobber (match_dup 3))])]
>   "TARGET_64BIT || (!TARGET_DEBUG_G_MODE && !TARGET_MIPS16)"
> {
> ....
>
> (define_insn "adddi3_internal_1"
>   [(set (match_operand:DI 0 "register_operand" "=d,&d")
>         (plus:DI (match_operand:DI 1 "register_operand" "0,d")
>                  (match_operand:DI 2 "register_operand" "d,d")))
>    (clobber (match_operand:SI 3 "register_operand" "=d,d"))]
>   "!TARGET_64BIT && !TARGET_DEBUG_G_MODE && !TARGET_MIPS16"
> {
>   return (REGNO (operands[0]) == REGNO (operands[1])
>           && REGNO (operands[0]) == REGNO (operands[2]))
>     ? "srl\t%3,%L0,31\;sll\t%M0,%M0,1\;sll\t%L0,%L1,1\;addu\t%M0,%M0,%3"
>     : "addu\t%L0,%L1,%L2\;sltu\t%3,%L0,%L2\;addu\t%M0,%M1,%M2\;addu\t%M0,%M0,%3";
> }
>   [(set_attr "type"     "darith")
>    (set_attr "mode"     "DI")
>    (set_attr "length"   "16")])

...was that it tended to generate poor code for the additions themselves.
When optabs.c implements the additions instead, the early RTL optimisers
get to see the individual instructions, and are often able to handle
constant or part-constant operands better.  This led to a noticeable
code-size improvement when I tested it originally.  (I imagine the effects
are even better now, thanks to the subreg lowering pass.)  See:

    http://gcc.gnu.org/ml/gcc-patches/2004-05/msg00947.html

for the patch that made this change, and some rationale.
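
To make the part-constant point concrete, here is a rough C model of
the four-instruction sequence (addu/sltu/addu/addu) from the pattern
quoted above.  It is only a sketch: the type and function names are
made up for illustration and are not taken from GCC.  When the low word
of one operand is known to be zero, the carry is always zero, so the
whole addition reduces to a single add on the high words.  That kind of
simplification is only possible once the four instructions are visible
individually.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration: a 64-bit value held as two
   32-bit words, as it would be on a 32-bit MIPS target. */
typedef struct { uint32_t lo, hi; } u64_pair;

/* General case: mirrors the addu / sltu / addu / addu sequence. */
static u64_pair add64_split(u64_pair a, u64_pair b)
{
    u64_pair r;
    uint32_t carry;
    r.lo = a.lo + b.lo;      /* addu: low words, wraps modulo 2^32 */
    carry = r.lo < b.lo;     /* sltu: 1 if the low addition wrapped */
    r.hi = a.hi + b.hi;      /* addu: high words */
    r.hi += carry;           /* addu: fold in the carry */
    return r;
}

/* Part-constant case: the low word of the addend is known to be zero,
   so the carry is always 0 and one addition is enough. */
static u64_pair add64_hi_only(u64_pair a, uint32_t b_hi)
{
    u64_pair r;
    r.lo = a.lo;
    r.hi = a.hi + b_hi;
    return r;
}

/* Reassemble the two halves so results can be compared directly. */
static uint64_t join(u64_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Small self-check: both routes agree when the addend's low word is 0. */
static void check_add64(void)
{
    u64_pair a = { 0xffffffffu, 1u };
    u64_pair c = { 0u, 5u };
    assert(join(add64_split(a, c)) == join(add64_hi_only(a, 5u)));
}
```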

As far as madd goes, I think it would be better to either
(a) get combine to handle this situation or (b) get expand
to generate a fused multiply-add from the outset.

(b) sounds like it might be useful in its own right.  At the moment we
treat the generation of floating-point multiply-adds as an optimisation,
but in some applications it's critical not to round the intermediate
result.  (I don't know if there's a bugzilla entry about this.)
If we treated fused multiply-add as a primitive operation, we could
extend it to integer types too.  In this case we'd also need to
handle widening multiplications, but we already need to do that
for stand-alone multiplications.
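
For reference, the integer operation I have in mind is roughly the
following C idiom: a widening 32x32->64 multiplication whose result is
added to a 64-bit accumulator, which is what MIPS32 madd and maddu
compute on HI/LO.  The function names are hypothetical, and whether a
given GCC version turns this source into madd depends on the version
and options, so treat it as a sketch of the semantics only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed widening multiply-accumulate.  Both
   operands are widened before multiplying, so the product is an exact
   64-bit value; the caller must keep the sum within int64_t range. */
static int64_t mac32(int64_t acc, int32_t a, int32_t b)
{
    return acc + (int64_t)a * (int64_t)b;
}

/* Hypothetical helper: unsigned variant (the maddu-style operation).
   The final addition wraps modulo 2^64. */
static uint64_t macu32(uint64_t acc, uint32_t a, uint32_t b)
{
    return acc + (uint64_t)a * (uint64_t)b;
}

/* Small self-check of the semantics. */
static void check_mac(void)
{
    assert(mac32(10, 3, 4) == 22);
    assert(mac32(5, -3, 4) == -7);
    assert(macu32(1u, 2u, 3u) == 7u);
}
```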

Just random musings, and probably not the answer you wanted to hear,
sorry.

Richard
