When we have the DSP multiply extension (ARMv5E and up, for ARM or Thumb-2), a 16-bit multiply is best done with smulbb rather than mul. This saves us doing redundant extends, and smulbb may even be the faster instruction on some cores. Unfortunately, expand does not try this operation by default, but it's trivial to add to the back-end. Since the value smulbb produces is the sign-extended 32-bit product, we represent this by expanding into the widening multiply pattern (mulhisi3) and then taking the HImode lowpart (a subreg) of the result.
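As a rough illustration of why taking the low half of the widening multiply is safe, here is a minimal C sketch. It is not part of the patch; the function names are made up for the example. It models the HImode multiply as truncation of the product to 16 bits, models the smulbb-style widening multiply as a sign-extended 16x16->32 product, and compares the low 16 bits of both.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a plain 16-bit multiply, i.e. only the low
   16 bits of the product are kept. */
uint16_t mul16_truncating(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}

/* Hypothetical helper: model of the widening form. Both halfword
   operands are sign-extended, a 32-bit product is formed, and the
   low 16 bits are taken (the "lowpart").  The uint16_t -> int16_t
   conversion is implementation-defined in ISO C; GCC wraps modulo
   2^16, which is what this sketch assumes. */
uint16_t mul16_via_widening(uint16_t a, uint16_t b)
{
    int32_t wide = (int32_t)(int16_t)a * (int32_t)(int16_t)b;
    return (uint16_t)(uint32_t)wide;
}

/* Count operand pairs, sampled every `step` values, for which the
   two forms disagree. */
unsigned long count_mismatches(unsigned step)
{
    unsigned long bad = 0;
    for (uint32_t a = 0; a <= 0xFFFF; a += step)
        for (uint32_t b = 0; b <= 0xFFFF; b += step)
            if (mul16_truncating((uint16_t)a, (uint16_t)b)
                != mul16_via_widening((uint16_t)a, (uint16_t)b))
                bad++;
    return bad;
}

/* Sampled self-check; a step of 1 would cover every pair but takes
   noticeably longer. */
void self_check(void)
{
    assert(count_mismatches(257) == 0);
}
```

The two forms agree because the low 16 bits of a product depend only on the low 16 bits of each operand, so the sign extension done by the widening multiply cannot affect the HImode result.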

        * arm.md (mulhi3): New expand pattern.

R.
--- gcc/config/arm/arm.md       (revision 201327)
+++ gcc/config/arm/arm.md       (local)
@@ -1725,6 +1725,20 @@ (define_expand "subdf3"
 
 ;; Multiplication insns
 
+(define_expand "mulhi3"
+  [(set (match_operand:HI 0 "s_register_operand" "")
+       (mult:HI (match_operand:HI 1 "s_register_operand" "")
+                (match_operand:HI 2 "s_register_operand" "")))]
+  "TARGET_DSP_MULTIPLY"
+  "
+  {
+    rtx result = gen_reg_rtx (SImode);
+    emit_insn (gen_mulhisi3 (result, operands[1], operands[2]));
+    emit_move_insn (operands[0], gen_lowpart (HImode, result));
+    DONE;
+  }"
+)
+
 (define_expand "mulsi3"
   [(set (match_operand:SI          0 "s_register_operand" "")
        (mult:SI (match_operand:SI 2 "s_register_operand" "")
