On 1 June 2011 15:14, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Wed, Jun 1, 2011 at 1:37 PM, Ira Rosen <ira.ro...@linaro.org> wrote:
>> On 1 June 2011 12:42, Richard Guenther <richard.guent...@gmail.com> wrote:
>>
>>> Did you think about moving pass_optimize_widening_mul before
>>> loop optimizations?  Does that pass catch the cases you are
>>> teaching the pattern recognizer?  I think we should try to expose
>>> these more complicated instructions to loop optimizers.
>>>
>>
>> pass_optimize_widening_mul doesn't catch these cases, but I can try to
>> teach it instead of the vectorizer.
>> I am now testing
>>
>> Index: passes.c
>> ===================================================================
>> --- passes.c    (revision 174391)
>> +++ passes.c    (working copy)
>> @@ -870,6 +870,7 @@
>>        NEXT_PASS (pass_split_crit_edges);
>>        NEXT_PASS (pass_pre);
>>        NEXT_PASS (pass_sink_code);
>> +      NEXT_PASS (pass_optimize_widening_mul);
>>        NEXT_PASS (pass_tree_loop);
>>         {
>>           struct opt_pass **p = &pass_tree_loop.pass.sub;
>> @@ -934,7 +935,6 @@
>>        NEXT_PASS (pass_forwprop);
>>        NEXT_PASS (pass_phiopt);
>>        NEXT_PASS (pass_fold_builtins);
>> -      NEXT_PASS (pass_optimize_widening_mul);
>>        NEXT_PASS (pass_tail_calls);
>>        NEXT_PASS (pass_rename_ssa_copies);
>>        NEXT_PASS (pass_uncprop);
>>
>> to see how it affects other loop optimizations (vectorizer pattern
>> tests obviously fail).

Looks like it needs copy_prop and dce as well:

Index: passes.c
===================================================================
--- passes.c    (revision 174391)
+++ passes.c    (working copy)
@@ -870,6 +870,9 @@
       NEXT_PASS (pass_split_crit_edges);
       NEXT_PASS (pass_pre);
       NEXT_PASS (pass_sink_code);
+      NEXT_PASS (pass_copy_prop);
+      NEXT_PASS (pass_dce);
+      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tree_loop);
        {
          struct opt_pass **p = &pass_tree_loop.pass.sub;
@@ -934,7 +937,6 @@
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
-      NEXT_PASS (pass_optimize_widening_mul);
       NEXT_PASS (pass_tail_calls);
       NEXT_PASS (pass_rename_ssa_copies);
       NEXT_PASS (pass_uncprop);

otherwise I get (on x86_64-suse-linux)

FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmaddsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfmsubsd
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddss
FAIL: gcc.target/i386/fma4-fma-2.c scan-assembler vfnmaddsd

Ira

>
> Thanks.  I would hope that we eventually can get rid of the
> pattern recognizer ... at least for SSE there is also always
> a scalar variant instruction for each vectorized one.
>
> Richard.
>
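
For reference, the six mnemonics in the FAIL lines above are the scalar FMA4 multiply-add forms in single (ss) and double (sd) precision. A minimal sketch of the source shapes they correspond to, assuming the test exercises expressions of roughly this form; the function names are hypothetical and are not taken from fma4-fma-2.c:

```c
/* Hypothetical scalar kernels, one per failing scan-assembler pattern.
   The names are illustrative only; the actual test source may differ. */

/* a * b + c     -> vfmaddss / vfmaddsd */
float  fmadd_f (float a, float b, float c)    { return a * b + c; }
double fmadd_d (double a, double b, double c) { return a * b + c; }

/* a * b - c     -> vfmsubss / vfmsubsd */
float  fmsub_f (float a, float b, float c)    { return a * b - c; }
double fmsub_d (double a, double b, double c) { return a * b - c; }

/* -(a * b) + c  -> vfnmaddss / vfnmaddsd */
float  fnmadd_f (float a, float b, float c)    { return -(a * b) + c; }
double fnmadd_d (double a, double b, double c) { return -(a * b) + c; }
```

Building such a file with something like -O2 -mfma4 and looking for the mnemonics in the generated assembly is one way to see whether the multiply and the add were fused after a pass reordering.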