This fixes a missed vectorization (actually a missed MAX_EXPR detection). At some point I made the phiprop pass not transform a load unless it was obvious that the loads would end up "direct". But this pessimizes the case in question, because it is not easy to verify whether forwprop will later combine the dereference with a non-invariant address.
So the patch removes that restriction again but arranges for phiprop
to run right before forwprop so that after that and FRE, phiopt has a
chance to optimize to the MAX_EXPR.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2013-05-23  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/57380
        * tree-ssa-phiprop.c (propagate_with_phi): Do not require
        at least one invariant or re-used load.
        * passes.c (init_optimization_passes): Move pass_phiprop before
        pass_forwprop.

        * g++.dg/tree-ssa/pr57380.C: New testcase.

Index: gcc/tree-ssa-phiprop.c
===================================================================
*** gcc/tree-ssa-phiprop.c      (revision 199199)
--- gcc/tree-ssa-phiprop.c      (working copy)
*************** propagate_with_phi (basic_block bb, gimp
*** 247,253 ****
    ssa_op_iter i;
    bool phi_inserted;
    tree type = NULL_TREE;
-   bool one_invariant = false;
  
    if (!POINTER_TYPE_P (TREE_TYPE (ptr))
        || !is_gimple_reg_type (TREE_TYPE (TREE_TYPE (ptr))))
--- 247,252 ----
*************** propagate_with_phi (basic_block bb, gimp
*** 282,298 ****
        if (!type
            && TREE_CODE (arg) == SSA_NAME)
          type = TREE_TYPE (phivn[SSA_NAME_VERSION (arg)].value);
-       if (TREE_CODE (arg) == ADDR_EXPR
-           && is_gimple_min_invariant (arg))
-         one_invariant = true;
      }
  
-   /* If we neither have an address of a decl nor can reuse a previously
-      inserted load, do not hoist anything.  */
-   if (!one_invariant
-       && !type)
-     return false;
- 
    /* Find a dereferencing use.  First follow (single use) ssa copy
       chains for ptr.  */
    while (single_imm_use (ptr, &use, &use_stmt)
--- 281,288 ----
Index: gcc/passes.c
===================================================================
*** gcc/passes.c        (revision 199199)
--- gcc/passes.c        (working copy)
*************** init_optimization_passes (void)
*** 1402,1413 ****
            NEXT_PASS (pass_ccp);
            /* After CCP we rewrite no longer addressed locals into SSA
               form if possible.  */
            NEXT_PASS (pass_forwprop);
            /* pass_build_alias is a dummy pass that ensures that we
               execute TODO_rebuild_alias at this point.  */
            NEXT_PASS (pass_build_alias);
            NEXT_PASS (pass_return_slot);
-           NEXT_PASS (pass_phiprop);
            NEXT_PASS (pass_fre);
            NEXT_PASS (pass_copy_prop);
            NEXT_PASS (pass_merge_phi);
--- 1402,1413 ----
            NEXT_PASS (pass_ccp);
            /* After CCP we rewrite no longer addressed locals into SSA
               form if possible.  */
+           NEXT_PASS (pass_phiprop);
            NEXT_PASS (pass_forwprop);
            /* pass_build_alias is a dummy pass that ensures that we
               execute TODO_rebuild_alias at this point.  */
            NEXT_PASS (pass_build_alias);
            NEXT_PASS (pass_return_slot);
            NEXT_PASS (pass_fre);
            NEXT_PASS (pass_copy_prop);
            NEXT_PASS (pass_merge_phi);
Index: gcc/testsuite/g++.dg/tree-ssa/pr57380.C
===================================================================
*** gcc/testsuite/g++.dg/tree-ssa/pr57380.C     (revision 0)
--- gcc/testsuite/g++.dg/tree-ssa/pr57380.C     (working copy)
***************
*** 0 ****
--- 1,21 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -fdump-tree-phiopt1" } */
+ 
+ struct my_array {
+   int data[4];
+ };
+ 
+ const int& my_max(const int& a, const int& b) {
+   return a < b ? b : a;
+ }
+ 
+ int f(my_array a, my_array b) {
+   int res = 0;
+   for (int i = 0; i < 4; ++i) {
+     res += my_max(a.data[i], b.data[i]);
+   }
+   return res;
+ }
+ 
+ /* { dg-final { scan-tree-dump "MAX_EXPR" "phiopt1" } } */
+ /* { dg-final { cleanup-tree-dump "phiopt1" } } */
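
For illustration only, here is a rough source-level analogue of the chain the testcase is meant to exercise. It is not GIMPLE and not what the passes literally emit. The names f_by_reference, f_loads_hoisted, f_max and self_check are made up for this sketch. The first form selects an address and loads through it afterwards, the second has the loads moved ahead of the select (roughly the shape phiprop aims for), and the third is the plain maximum that phiopt can then recognize as a MAX_EXPR.

```cpp
#include <algorithm>
#include <cassert>

struct my_array {
  int data[4];
};

// Form 1: as in the testcase.  The conditional selects an *address*;
// the load happens afterwards, through the selected reference.
const int& my_max(const int& a, const int& b) {
  return a < b ? b : a;
}

int f_by_reference(my_array a, my_array b) {
  int res = 0;
  for (int i = 0; i < 4; ++i)
    res += my_max(a.data[i], b.data[i]);
  return res;
}

// Form 2 (hypothetical, hand-written): both values are loaded first and
// the conditional selects a *value*.  This mimics a PHI of addresses
// followed by a load being turned into a PHI of loaded values.
int f_loads_hoisted(my_array a, my_array b) {
  int res = 0;
  for (int i = 0; i < 4; ++i) {
    int av = a.data[i];
    int bv = b.data[i];
    res += av < bv ? bv : av;
  }
  return res;
}

// Form 3 (hypothetical, hand-written): the value-level select is just a
// maximum, which is the pattern the testcase scans for as MAX_EXPR.
int f_max(my_array a, my_array b) {
  int res = 0;
  for (int i = 0; i < 4; ++i)
    res += std::max(a.data[i], b.data[i]);
  return res;
}

// Checks that the three forms agree on one sample input.
void self_check() {
  my_array a = {{1, 7, 3, 9}};
  my_array b = {{4, 2, 8, 5}};
  assert(f_by_reference(a, b) == f_loads_hoisted(a, b));
  assert(f_loads_hoisted(a, b) == f_max(a, b));
}
```

Calling self_check() confirms the three forms compute the same sum for the sample arrays. Whether the compiler actually reaches the third form still has to be checked in the phiopt1 dump, as the testcase does.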