When we have the DSP multiply extension (ARMv5E and up for ARM, or
Thumb-2), a 16-bit multiply is best done with smulbb rather than mul.
This saves us doing redundant extends and may even be a faster
instruction on some cores. Unfortunately, expand does not try this
operation by default, but it's trivial to add to the back-end. Since
the value produced is extended, we represent this by expanding into the
widening multiply (mulhisi3) and then taking the HImode subreg of the
result.
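
As a sanity check on the reasoning above, here is a minimal C sketch of
why taking the low half of the widening product is safe. It only models
the arithmetic; the helper names (widening_mul16, mul_hi_mode,
lowpart16, lowparts_agree) are made up for illustration and are not
part of GCC or of this patch.

```c
#include <stdint.h>

/* Models the widening multiply: signed 16x16 -> 32 bits, which is
   what smulbb computes from the bottom halfwords of its operands.
   The product of two int16_t values always fits in an int32_t.  */
static int32_t widening_mul16(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Models an HImode multiply: only the low 16 bits of the product
   are kept.  Done in unsigned arithmetic to avoid overflow.  */
static uint16_t mul_hi_mode(uint16_t a, uint16_t b)
{
    return (uint16_t)((uint32_t)a * (uint32_t)b);
}

/* Models taking the HImode lowpart of the SImode result.  */
static uint16_t lowpart16(int32_t x)
{
    return (uint16_t)((uint32_t)x & 0xFFFFu);
}

/* Returns 1 when both routes give the same 16-bit result.  */
static int lowparts_agree(int16_t a, int16_t b)
{
    return lowpart16(widening_mul16(a, b))
           == mul_hi_mode((uint16_t)a, (uint16_t)b);
}
```

The low 16 bits of a product depend only on the low 16 bits of the
inputs, so the two routes should agree for every pair of operands,
including ones whose full product does not fit in 16 bits.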
* arm.md (mulhi3): New expand pattern.
R.
--- gcc/config/arm/arm.md (revision 201327)
+++ gcc/config/arm/arm.md (local)
@@ -1725,6 +1725,20 @@ (define_expand "subdf3"
;; Multiplication insns
+(define_expand "mulhi3"
+ [(set (match_operand:HI 0 "s_register_operand" "")
+ (mult:HI (match_operand:HI 1 "s_register_operand" "")
+ (match_operand:HI 2 "s_register_operand" "")))]
+ "TARGET_DSP_MULTIPLY"
+ "
+ {
+ rtx result = gen_reg_rtx (SImode);
+ emit_insn (gen_mulhisi3 (result, operands[1], operands[2]));
+ emit_move_insn (operands[0], gen_lowpart (HImode, result));
+ DONE;
+ }"
+)
+
(define_expand "mulsi3"
[(set (match_operand:SI 0 "s_register_operand" "")
(mult:SI (match_operand:SI 2 "s_register_operand" "")