This patch is part bug fix, part optimization improvement.
Firstly, my initial patch series introduced a bug that caused an internal compiler error when an input to a multiply was a constant: the GIMPLE verification rejects a constant operand in that position. I'm not totally clear how this ever worked before, but I've corrected it by inserting a temporary SSA_NAME between the constant and the multiply.
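To make that concrete, here is the shape of the new wmul-11.c test, followed by schematic GIMPLE (the SSA names and exact dump syntax below are invented for illustration):

  long long
  foo (int *b)
  {
    return 10 * (long long)*b;
  }

Before the pass runs, this is roughly:

  _1 = *b_2(D);
  _3 = (long long int) _1;
  _4 = _3 * 10;

and with the fix convert_mult_to_widen now emits roughly:

  _1 = *b_2(D);
  tmp_5 = 10;          /* the constant copied into a fresh SSA_NAME */
  _4 = _1 w* tmp_5;    /* WIDEN_MULT_EXPR; smull on ARM */

instead of feeding the bare INTEGER_CST straight into the WIDEN_MULT_EXPR, which is what the verifier objected to.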
I also discovered that widening multiply-and-accumulate operations were not recognised if any one of the three inputs was a constant. I've corrected this by adjusting the pattern matching, which also required inserting new SSA_NAMEs to make it work.
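For example, the new wmul-12.c test reduces (again schematically, with invented SSA names) to GIMPLE of the form:

  _2 = *b_6(D);
  _3 = *c_7(D);
  tmp_1 = _2 * _3;               /* narrow int multiply */
  _4 = (long long int) tmp_1;
  _5 = _4 + 10;                  /* constant addend */

Previously convert_plusminus_to_widen returned false as soon as either operand of the plus/minus was not an SSA_NAME, so the constant 10 here (or the constant multiplier in wmul-13.c) blocked the conversion to a widening multiply-and-accumulate and no smlal was generated.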
In order to insert the new SSA_NAME, I've simply reused the existing type conversion code - the only difference is that the conversion may be a no-op, in which case it just generates a straightforward assignment.
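Concretely, the reused pattern looks like this at the convert_mult_to_widen call site (adapted from the hunk below; the comments and the NULL prefix argument to create_tmp_var are mine, since the hunk cuts that call short):

  /* Trigger not only when the operand needs widening or a signedness
     change, but also whenever it is not an SSA_NAME, i.e. a constant.  */
  if (actual_precision != TYPE_PRECISION (type1)
      || from_unsigned1 != TYPE_UNSIGNED (type1)
      || TREE_CODE (rhs1) != SSA_NAME)
    {
      tmp = create_tmp_var (build_nonstandard_integer_type
                              (actual_precision, from_unsigned1),
                            NULL);  /* NULL prefix assumed for illustration */
      /* The "cast" may be a real conversion or a no-op; either way the
         result is a fresh SSA_NAME that the widening operation can use.  */
      rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
    }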
OK?

Andrew
2011-07-21  Andrew Stubbs  <a...@codesourcery.com>

        gcc/
        * tree-ssa-math-opts.c (is_widening_mult_rhs_p): Handle constants
        beyond conversions.
        (convert_mult_to_widen): Create SSA_NAME for constant inputs.
        (convert_plusminus_to_widen): Don't automatically reject inputs
        that are not an SSA_NAME.
        Create SSA_NAME for constant inputs.

        gcc/testsuite/
        * gcc.target/arm/wmul-11.c: New file.
        * gcc.target/arm/wmul-12.c: New file.
        * gcc.target/arm/wmul-13.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-11.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b)
+{
+  return 10 * (long long)*b;
+}
+
+/* { dg-final { scan-assembler "smull" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-12.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *b, int *c)
+{
+  int tmp = *b * *c;
+  return 10 + (long long)tmp;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/wmul-13.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target arm_dsp } */
+
+long long
+foo (int *a, int *b)
+{
+  return *a + (long long)*b * 10;
+}
+
+/* { dg-final { scan-assembler "smlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -1997,6 +1997,13 @@ is_widening_mult_rhs_p (tree type, tree rhs, tree *type_out,
       type1 = TREE_TYPE (rhs1);
     }
 
+  if (TREE_CODE (rhs1) == INTEGER_CST)
+    {
+      *new_rhs_out = rhs1;
+      *type_out = NULL;
+      return true;
+    }
+
   if (TREE_CODE (type1) != TREE_CODE (type)
       || TYPE_PRECISION (type1) * 2 > TYPE_PRECISION (type))
     return false;
@@ -2152,7 +2159,8 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
      for the opcode.  This will be the full mode size.  */
   actual_precision = GET_MODE_PRECISION (actual_mode);
   if (actual_precision != TYPE_PRECISION (type1)
-      || from_unsigned1 != TYPE_UNSIGNED (type1))
+      || from_unsigned1 != TYPE_UNSIGNED (type1)
+      || TREE_CODE (rhs1) != SSA_NAME)
     {
       tmp = create_tmp_var (build_nonstandard_integer_type
                               (actual_precision, from_unsigned1),
@@ -2160,7 +2168,8 @@ convert_mult_to_widen (gimple stmt, gimple_stmt_iterator *gsi)
       rhs1 = build_and_insert_cast (gsi, loc, tmp, rhs1);
     }
   if (actual_precision != TYPE_PRECISION (type2)
-      || from_unsigned2 != TYPE_UNSIGNED (type2))
+      || from_unsigned2 != TYPE_UNSIGNED (type2)
+      || TREE_CODE (rhs2) != SSA_NAME)
     {
       /* Reuse the same type info, if possible.  */
       if (!tmp || from_unsigned1 != from_unsigned2)
@@ -2221,8 +2230,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
       if (is_gimple_assign (rhs1_stmt))
        rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
     }
-  else
-    return false;
 
   if (TREE_CODE (rhs2) == SSA_NAME)
     {
@@ -2230,8 +2237,6 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
       if (is_gimple_assign (rhs2_stmt))
        rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
     }
-  else
-    return false;
 
   /* Allow for one conversion statement between the multiply
      and addition/subtraction statement.  If there are more than
@@ -2358,7 +2363,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
      for the opcode.  This will be the full mode size.  */
   actual_precision = GET_MODE_PRECISION (actual_mode);
   if (actual_precision != TYPE_PRECISION (type1)
-      || from_unsigned1 != TYPE_UNSIGNED (type1))
+      || from_unsigned1 != TYPE_UNSIGNED (type1)
+      || TREE_CODE (mult_rhs1) != SSA_NAME)
     {
       tmp = create_tmp_var (build_nonstandard_integer_type
                               (actual_precision, from_unsigned1),
@@ -2366,7 +2372,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
       mult_rhs1 = build_and_insert_cast (gsi, loc, tmp, mult_rhs1);
     }
   if (actual_precision != TYPE_PRECISION (type2)
-      || from_unsigned2 != TYPE_UNSIGNED (type2))
+      || from_unsigned2 != TYPE_UNSIGNED (type2)
+      || TREE_CODE (mult_rhs2) != SSA_NAME)
     {
       if (!tmp || from_unsigned1 != from_unsigned2)
        tmp = create_tmp_var (build_nonstandard_integer_type