I noticed a few failing SPEC benchmarks on an AVX2 machine. The cause is that we now attempt BB vectorization on the if-converted loop body when loop vectorization fails, which can leave us with unvectorized masked loads/stores. As BB vectorization does not handle masked loads/stores at all, the following patch simply avoids doing BB vectorization in that case.
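
For illustration only (this is a hypothetical sketch, not the testcase from the PR, and the name cond_update is made up): a loop with a conditional load and a conditional store is the kind of body that if-conversion may rewrite using IFN_MASK_LOAD / IFN_MASK_STORE on an AVX2 target. Whether that actually happens depends on the GCC revision and flags; something like -O3 -mavx2 is assumed here.

```c
/* Hypothetical example: the load from b[i] and the store to a[i] both
   happen only when c[i] is nonzero.  If-conversion may turn them into
   masked loads/stores so the loop body becomes a single basic block.  */
void
cond_update (int *a, const int *b, const int *c, int n)
{
  for (int i = 0; i < n; i++)
    if (c[i])
      a[i] = b[i] + 1;
}
```

If loop vectorization of such a body then fails, the if-converted block still contains the masked calls, which is the situation the patch guards against.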

Bootstrap / regtest in progress.  I'll try to isolate a testcase later.

Richard.

2016-11-24  Richard Biener  <rguent...@suse.de>

        PR tree-optimization/78396
        * tree-vectorizer.c (vectorize_loops): When the if-converted
        body contains masked loads or stores do not attempt to
        basic-block-vectorize it.

Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c       (revision 242834)
+++ gcc/tree-vectorizer.c       (working copy)
@@ -570,14 +570,22 @@ vectorize_epilogue:
            && ! loop->inner)
          {
            basic_block bb = loop->header;
+           bool has_mask_load_store = false;
            for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
                 !gsi_end_p (gsi); gsi_next (&gsi))
              {
                gimple *stmt = gsi_stmt (gsi);
+               if (gimple_call_internal_p (stmt)
+                   && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
+                       || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
+                 {
+                   has_mask_load_store = true;
+                   break;
+                 }
                gimple_set_uid (stmt, -1);
                gimple_set_visited (stmt, false);
              }
-           if (vect_slp_bb (bb))
+           if (! has_mask_load_store && vect_slp_bb (bb))
              {
                dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
                                 "basic block vectorized\n");