This is the final variant of the patch working towards enabling a less costly vectorization variant at -O2 by default. It introduces a "cheap" cost-model variant by turning the existing -fvect-cost-model option into one that takes an argument: "unlimited" (same as -fno-vect-cost-model), "dynamic" (same as -fvect-cost-model, and the default) or "cheap".
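For illustration only (this is not part of the patch): a minimal standalone C sketch of how the three option arguments map onto the enum added to flag-types.h, and how an unset option resolves to "dynamic". The names parse_cost_model and resolve_cost_model are hypothetical. In GCC itself the string-to-enum mapping is generated from common.opt, and the resolution is what vectorizer_cost_model () in the patch does with the global flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same values as the enum the patch adds to flag-types.h.  */
enum vect_cost_model {
  VECT_COST_MODEL_UNLIMITED = 0,
  VECT_COST_MODEL_CHEAP = 1,
  VECT_COST_MODEL_DYNAMIC = 2,
  VECT_COST_MODEL_DEFAULT = 3
};

/* Hypothetical helper, not a GCC function: translate the argument of
   -fvect-cost-model= into the enum.  Returns 0 on success and -1 for
   an unknown model name (GCC would issue an error instead).  */
int
parse_cost_model (const char *arg, enum vect_cost_model *out)
{
  assert (arg != NULL && out != NULL);
  if (strcmp (arg, "unlimited") == 0)
    *out = VECT_COST_MODEL_UNLIMITED;
  else if (strcmp (arg, "dynamic") == 0)
    *out = VECT_COST_MODEL_DYNAMIC;
  else if (strcmp (arg, "cheap") == 0)
    *out = VECT_COST_MODEL_CHEAP;
  else
    return -1;
  return 0;
}

/* Hypothetical stand-in for vectorizer_cost_model () that takes the flag
   as a parameter instead of reading a global: an explicit choice wins,
   otherwise the default is the dynamic model.  */
enum vect_cost_model
resolve_cost_model (enum vect_cost_model flag)
{
  if (flag != VECT_COST_MODEL_DEFAULT)
    return flag;
  return VECT_COST_MODEL_DYNAMIC;
}
```

Keeping a separate DEFAULT value lets the compiler tell "the user asked for dynamic" apart from "the user said nothing", which is what would allow a later patch to pick a different default per optimization level.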
With the "cheap" model we try not to disturb non-vectorized code, so we do not inhibit any PRE and do not perform if-conversion. We also avoid any loop versioning due to alignment or aliasing. With this, runtime performance of SPEC CPU 2006 does not regress when comparing -O2 to -O2 -ftree-vectorize -fno-tree-slp-vectorize -fvect-cost-model=cheap. Few progressions remain, as do effects on compile time and binary size (more data in [3/3]).

Due to implementation bugs, SLP is not viable for -O2, even though its profitability should be much easier to assess.

Independent of whether [3/3] gets positive feedback, I'd like to push this patch in. Thus, comments welcome - as usual I'll interpret silence as positive feedback ;)

Re-bootstrap / regtest running on x86_64-unknown-linux-gnu.

Thanks,
Richard.

2013-05-28  Richard Biener  <rguent...@suse.de>

        common/
        * config/i386/i386-common.c (ix86_option_init_struct): Do not enable OPT_fvect_cost_model.

        * common.opt (fvect-cost-model=): New option.
        (vect_cost_model): New enum and values.
        (fvect-cost-model): Alias to -fvect-cost-model=dynamic.
        (fno-vect-cost-model): Alias to -fvect-cost-model=unlimited.
        (ftree-vect-loop-version): Ignore.
        * opts.c (default_options_table): Do not set OPT_fvect_cost_model.
        (common_handle_option): Likewise.
        (finish_options): Adjust condition that sets PARAM_MAX_STORES_TO_SINK.
        * flag-types.h (enum vect_cost_model): New enum.
        * doc/invoke.texi (ftree-vect-loop-version): Remove.
        (fvect-cost-model): Adjust documentation.
        * targhooks.c (default_add_stmt_cost): Do not check flag_vect_cost_model.
        * tree-vectorizer.h (struct _loop_vec_info): Add cost model field.
        (struct _bb_vec_info): Likewise.
        (vectorizer_cost_model): Declare.
        * tree-vect-data-refs.c (vect_peeling_hash_insert): Check the loop's cost-model flag.
        (vect_peeling_hash_choose_best_peeling): Likewise.
        (vect_enhance_data_refs_alignment): Likewise. Do not check flag_tree_vect_loop_version but check the cost model.
        (vect_mark_for_runtime_alias_test): Do not add runtime alias checks for the cheap cost model.
        * tree-vect-loop.c (vect_analyze_loop): Initialize the loop's cost model flag.
        (vect_estimate_min_profitable_iters): Use the loop's cost model flag.
        * tree-vect-slp.c (vect_slp_analyze_bb_1): Initialize and use the BB's cost model flag.
        * tree-vectorizer.c (gate_vect_slp): Adjust.
        (vectorizer_cost_model): Return the active cost model.
        * Makefile.in (tree-if-conv.o): Depend on $(TREE_VECTORIZER_H).
        (tree-ssa-pre.o): Likewise.
        * tree-if-conv.c: Include tree-vectorizer.h.
        (gate_tree_if_conversion): Enable if-conversion via the vectorizer only if the cost-model is not cheap.
        * tree-ssa-pre.c: Include tree-vectorizer.h.
        (inhibit_phi_insertion): Do not inhibit PHI insertion for the cheap vectorizer cost model.

Index: trunk/gcc/common.opt
===================================================================
*** trunk.orig/gcc/common.opt 2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/common.opt 2013-05-28 14:14:39.265369281 +0200
*************** EnumValue
*** 1304,1310 ****
  Enum(stack_reuse_level) String(none) Value(SR_NONE)

  ftree-loop-if-convert
! Common Report Var(flag_tree_loop_if_convert) Init(-1) Optimization
  Convert conditional jumps in innermost loops to branchless equivalents

  ftree-loop-if-convert-stores
--- 1304,1310 ----
  Enum(stack_reuse_level) String(none) Value(SR_NONE)

  ftree-loop-if-convert
! Common Report Var(flag_tree_loop_if_convert) Optimization
  Convert conditional jumps in innermost loops to branchless equivalents

  ftree-loop-if-convert-stores
*************** Common RejectNegative Joined UInteger Va
*** 2267,2282 ****
  -ftree-vectorizer-verbose=<number> This switch is deprecated. Use -fopt-info instead.

  ftree-slp-vectorize
! Common Report Var(flag_tree_slp_vectorize) Init(2) Optimization
  Enable basic block vectorization (SLP) on trees

  fvect-cost-model
! Common Report Var(flag_vect_cost_model) Optimization
! Enable use of cost model in vectorization

  ftree-vect-loop-version
! Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
! Enable loop versioning when doing loop vectorization on trees

  ftree-scev-cprop
  Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
--- 2267,2302 ----
  -ftree-vectorizer-verbose=<number> This switch is deprecated. Use -fopt-info instead.

  ftree-slp-vectorize
! Common Report Var(flag_tree_slp_vectorize) Optimization
  Enable basic block vectorization (SLP) on trees

+ fvect-cost-model=
+ Common Joined RejectNegative Enum(vect_cost_model) Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
+ Specifies the cost model for vectorization
+
+ Enum
+ Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown vectorizer cost model %qs)
+
+ EnumValue
+ Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)
+
+ EnumValue
+ Enum(vect_cost_model) String(dynamic) Value(VECT_COST_MODEL_DYNAMIC)
+
+ EnumValue
+ Enum(vect_cost_model) String(cheap) Value(VECT_COST_MODEL_CHEAP)
+
  fvect-cost-model
! Common RejectNegative Alias(fvect-cost-model=,dynamic)
! Enables the dynamic vectorizer cost model. Preserved for backward compatibility.
!
! fno-vect-cost-model
! Common RejectNegative Alias(fvect-cost-model=,unlimited)
! Enables the unlimited vectorizer cost model. Preserved for backward compatibility.

  ftree-vect-loop-version
! Common Ignore
! Does nothing. Preserved for backward compatibility.

  ftree-scev-cprop
  Common Report Var(flag_tree_scev_cprop) Init(1) Optimization
Index: trunk/gcc/opts.c
===================================================================
*** trunk.orig/gcc/opts.c 2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/opts.c 2013-05-28 14:14:39.282369470 +0200
*************** static const struct default_options defa
*** 498,504 ****
      { OPT_LEVELS_3_PLUS, OPT_funswitch_loops, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_ftree_vectorize, NULL, 1 },
-     { OPT_LEVELS_3_PLUS, OPT_fvect_cost_model, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_fipa_cp_clone, NULL, 1 },
      { OPT_LEVELS_3_PLUS, OPT_ftree_partial_pre, NULL, 1 },

--- 498,503 ----
*************** finish_options (struct gcc_options *opts
*** 823,831 ****
          }
      }

!   /* Set PARAM_MAX_STORES_TO_SINK to 0 if either vectorization or if-conversion
!      is disabled. */
!   if (!opts->x_flag_tree_vectorize || !opts->x_flag_tree_loop_if_convert)
      maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
                             opts->x_param_values, opts_set->x_param_values);

--- 822,832 ----
          }
      }

!   /* Set PARAM_MAX_STORES_TO_SINK to 0 if vectorization is not enabled
!      or if-conversion is explicitly disabled. */
!   if (!opts->x_flag_tree_vectorize
!       || (opts_set->x_flag_tree_loop_if_convert
!           && !opts->x_flag_tree_loop_if_convert))
      maybe_set_param_value (PARAM_MAX_STORES_TO_SINK, 0,
                             opts->x_param_values, opts_set->x_param_values);

*************** common_handle_option (struct gcc_options
*** 1597,1604 ****
          opts->x_flag_gcse_after_reload = value;
        if (!opts_set->x_flag_tree_vectorize)
          opts->x_flag_tree_vectorize = value;
-       if (!opts_set->x_flag_vect_cost_model)
-         opts->x_flag_vect_cost_model = value;
        if (!opts_set->x_flag_tree_loop_distribute_patterns)
          opts->x_flag_tree_loop_distribute_patterns = value;
        break;
--- 1598,1603 ----
Index: trunk/gcc/common/config/i386/i386-common.c
===================================================================
*** trunk.orig/gcc/common/config/i386/i386-common.c 2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/common/config/i386/i386-common.c 2013-05-28 14:14:39.309369766 +0200
*************** ix86_option_init_struct (struct gcc_opti
*** 729,735 ****

    opts->x_flag_pcc_struct_return = 2;
    opts->x_flag_asynchronous_unwind_tables = 2;
-   opts->x_flag_vect_cost_model = 1;
  }

  /* On the x86 -fsplit-stack and -fstack-protector both use the same
--- 729,734 ----
Index: trunk/gcc/flag-types.h
===================================================================
*** trunk.orig/gcc/flag-types.h 2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/flag-types.h 2013-05-28 14:14:39.309369766 +0200
*************** enum fp_contract_mode {
*** 191,194 ****
--- 191,202 ----
    FP_CONTRACT_FAST = 2
  };

+ /* Vectorizer cost-model. */
+ enum vect_cost_model {
+   VECT_COST_MODEL_UNLIMITED = 0,
+   VECT_COST_MODEL_CHEAP = 1,
+   VECT_COST_MODEL_DYNAMIC = 2,
+   VECT_COST_MODEL_DEFAULT = 3
+ };
+
  #endif /* ! GCC_FLAG_TYPES_H */
Index: trunk/gcc/targhooks.c
===================================================================
*** trunk.orig/gcc/targhooks.c 2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/targhooks.c 2013-05-28 14:14:39.321369902 +0200
*************** default_add_stmt_cost (void *data, int c
*** 1050,1070 ****
  {
    unsigned *cost = (unsigned *) data;
    unsigned retval = 0;

!   if (flag_vect_cost_model)
!     {
!       tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
!       int stmt_cost = default_builtin_vectorization_cost (kind, vectype,
!                                                           misalign);
!       /* Statements in an inner loop relative to the loop being
!          vectorized are weighted more heavily. The value here is
!          arbitrary and could potentially be improved with analysis. */
!       if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
!         count *= 50; /* FIXME. */
!
!       retval = (unsigned) (count * stmt_cost);
!       cost[where] += retval;
!     }

    return retval;
  }
--- 1050,1066 ----
  {
    unsigned *cost = (unsigned *) data;
    unsigned retval = 0;

+   tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
+   int stmt_cost = default_builtin_vectorization_cost (kind, vectype,
+                                                       misalign);
+   /* Statements in an inner loop relative to the loop being
+      vectorized are weighted more heavily. The value here is
+      arbitrary and could potentially be improved with analysis. */
+   if (where == vect_body && stmt_info && stmt_in_inner_loop_p (stmt_info))
+     count *= 50; /* FIXME. */
!   retval = (unsigned) (count * stmt_cost);
!   cost[where] += retval;

    return retval;
  }
Index: trunk/gcc/tree-vect-data-refs.c
===================================================================
*** trunk.orig/gcc/tree-vect-data-refs.c 2013-05-28 13:40:29.000000000 +0200
--- trunk/gcc/tree-vect-data-refs.c 2013-05-28 14:14:39.336370071 +0200
*************** vect_mark_for_runtime_alias_test (ddr_p
*** 173,179 ****
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);

!   if ((unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS) == 0)
      return false;

    if (dump_enabled_p ())
--- 173,180 ----
  {
    struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);

!   if (loop_vinfo->cost_model == VECT_COST_MODEL_CHEAP
!       || (unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS) == 0)
      return false;

    if (dump_enabled_p ())
*************** vect_peeling_hash_insert (loop_vec_info
*** 1087,1093 ****
        *new_slot = slot;
      }

!   if (!supportable_dr_alignment && !flag_vect_cost_model)
      slot->count += VECT_MAX_COST;
  }

--- 1088,1095 ----
        *new_slot = slot;
      }

!   if (!supportable_dr_alignment
!       && loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
      slot->count += VECT_MAX_COST;
  }

*************** vect_peeling_hash_choose_best_peeling (l
*** 1197,1203 ****
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost();

!   if (flag_vect_cost_model)
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
--- 1199,1205 ----
    res.peel_info.dr = NULL;
    res.body_cost_vec = stmt_vector_for_cost();

!   if (loop_vinfo->cost_model != VECT_COST_MODEL_UNLIMITED)
      {
        res.inside_cost = INT_MAX;
        res.outside_cost = INT_MAX;
*************** vect_enhance_data_refs_alignment (loop_v
*** 1426,1432 ****
                   vectorization factor.
                   We do this automtically for cost model, since we calculate cost
                   for every peeling option. */
!               if (!flag_vect_cost_model)
                  possible_npeel_number = vf /nelements;

                /* Handle the aligned case. We may decide to align some other
--- 1428,1434 ----
                   vectorization factor.
                   We do this automtically for cost model, since we calculate cost
                   for every peeling option. */
!               if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
                  possible_npeel_number = vf /nelements;

                /* Handle the aligned case. We may decide to align some other
*************** vect_enhance_data_refs_alignment (loop_v
*** 1434,1440 ****
                if (DR_MISALIGNMENT (dr) == 0)
                  {
                    npeel_tmp = 0;
!                   if (!flag_vect_cost_model)
                      possible_npeel_number++;
                  }

--- 1436,1442 ----
                if (DR_MISALIGNMENT (dr) == 0)
                  {
                    npeel_tmp = 0;
!                   if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
                      possible_npeel_number++;
                  }

*************** vect_enhance_data_refs_alignment (loop_v
*** 1743,1749 ****
    /* (2) Versioning to force alignment. */

    /* Try versioning if:
!      1) flag_tree_vect_loop_version is TRUE
       2) optimize loop for speed
       3) there is at least one unsupported misaligned data ref
          with an unknown misalignment, and
--- 1745,1751 ----
    /* (2) Versioning to force alignment. */

    /* Try versioning if:
!      1) cost model is not VECT_COST_MODEL_CHEAP
       2) optimize loop for speed
       3) there is at least one unsupported misaligned data ref
          with an unknown misalignment, and
*************** vect_enhance_data_refs_alignment (loop_v
*** 1751,1757 ****
       5) the number of runtime alignment checks is within reason. */

    do_versioning =
!         flag_tree_vect_loop_version
          && optimize_loop_nest_for_speed_p (loop)
          && (!loop->inner); /* FORNOW */

--- 1753,1759 ----
       5) the number of runtime alignment checks is within reason. */

    do_versioning =
!         loop_vinfo->cost_model != VECT_COST_MODEL_CHEAP
          && optimize_loop_nest_for_speed_p (loop)
          && (!loop->inner); /* FORNOW */

Index: trunk/gcc/tree-vect-loop.c
===================================================================
*** trunk.orig/gcc/tree-vect-loop.c 2013-05-28 13:47:04.000000000 +0200
--- trunk/gcc/tree-vect-loop.c 2013-05-28 14:14:39.337370082 +0200
*************** vect_analyze_loop (struct loop *loop)
*** 1763,1768 ****
--- 1763,1770 ----
            return NULL;
          }

+       loop_vinfo->cost_model = vectorizer_cost_model ();
+
        if (vect_analyze_loop_2 (loop_vinfo))
          {
            LOOP_VINFO_VECTORIZABLE_P (loop_vinfo) = 1;
*************** vect_estimate_min_profitable_iters (loop
*** 2636,2642 ****
    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

    /* Cost model disabled. */
!   if (!flag_vect_cost_model)
      {
        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.");
        *ret_min_profitable_niters = 0;
--- 2638,2644 ----
    void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

    /* Cost model disabled. */
!   if (loop_vinfo->cost_model == VECT_COST_MODEL_UNLIMITED)
      {
        dump_printf_loc (MSG_NOTE, vect_location, "cost model disabled.");
        *ret_min_profitable_niters = 0;
Index: trunk/gcc/tree-vect-slp.c
===================================================================
*** trunk.orig/gcc/tree-vect-slp.c 2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/tree-vect-slp.c 2013-05-28 14:14:39.338370093 +0200
*************** vect_slp_analyze_bb_1 (basic_block bb)
*** 1992,1997 ****
--- 1992,2001 ----
    if (!bb_vinfo)
      return NULL;

+   /* For BB vectorization it only matters whether the cost model is
+      enabled or disabled. */
+   bb_vinfo->cost_model = vectorizer_cost_model ();
+
    if (!vect_analyze_data_refs (NULL, bb_vinfo, &min_vf))
      {
        if (dump_enabled_p ())
*************** vect_slp_analyze_bb_1 (basic_block bb)
*** 2093,2099 ****
      }

    /* Cost model: check if the vectorization is worthwhile. */
!   if (flag_vect_cost_model
        && !vect_bb_vectorization_profitable_p (bb_vinfo))
      {
        if (dump_enabled_p ())
--- 2097,2103 ----
      }

    /* Cost model: check if the vectorization is worthwhile. */
!   if (bb_vinfo->cost_model != VECT_COST_MODEL_UNLIMITED
        && !vect_bb_vectorization_profitable_p (bb_vinfo))
      {
        if (dump_enabled_p ())
Index: trunk/gcc/tree-vectorizer.c
===================================================================
*** trunk.orig/gcc/tree-vectorizer.c 2013-05-17 10:55:39.000000000 +0200
--- trunk/gcc/tree-vectorizer.c 2013-05-28 14:21:57.417255737 +0200
*************** LOC vect_location;
*** 73,78 ****
--- 73,88 ----
  /* Vector mapping GIMPLE stmt to stmt_vec_info. */
  vec<vec_void_p> stmt_vec_info_vec;

+ /* Return the active vectorizer cost model. */
+
+ enum vect_cost_model
+ vectorizer_cost_model (void)
+ {
+   if (flag_vect_cost_model != VECT_COST_MODEL_DEFAULT)
+     return flag_vect_cost_model;
+   /* The default cost model is the dynamic one. */
+   return VECT_COST_MODEL_DYNAMIC;
+ }

  /* Function vectorize_loops.

*************** execute_vect_slp (void)
*** 191,200 ****
  static bool
  gate_vect_slp (void)
  {
!   /* Apply SLP either if the vectorizer is on and the user didn't specify
!      whether to run SLP or not, or if the SLP flag was set by the user. */
!   return ((flag_tree_vectorize != 0 && flag_tree_slp_vectorize != 0)
!           || flag_tree_slp_vectorize == 1);
  }

  struct gimple_opt_pass pass_slp_vectorize =
--- 201,211 ----
  static bool
  gate_vect_slp (void)
  {
!   /* Apply SLP either according to whether the user specified whether to
!      run SLP or not, or according to whether vectorization is enabled. */
!   if (global_options_set.x_flag_tree_slp_vectorize)
!     return flag_tree_slp_vectorize != 0;
!   return flag_tree_vectorize != 0;
  }

  struct gimple_opt_pass pass_slp_vectorize =
Index: trunk/gcc/tree-vectorizer.h
===================================================================
*** trunk.orig/gcc/tree-vectorizer.h 2013-05-17 10:56:07.000000000 +0200
--- trunk/gcc/tree-vectorizer.h 2013-05-28 14:14:39.338370093 +0200
*************** typedef struct _loop_vec_info {
*** 314,319 ****
--- 314,322 ----
       fix it up. */
    bool operands_swapped;

+   /* The cost model to be used for this loop. */
+   enum vect_cost_model cost_model;
+
  } *loop_vec_info;

  /* Access Functions. */
*************** typedef struct _bb_vec_info {
*** 391,396 ****
--- 394,402 ----
    /* Cost data used by the target cost model. */
    void *target_cost_data;

+   /* The cost model to be used for this BB. */
+   enum vect_cost_model cost_model;
+
  } *bb_vec_info;

  #define BB_VINFO_BB(B) (B)->bb
*************** void vect_pattern_recog (loop_vec_info,
*** 1010,1014 ****
--- 1016,1021 ----

  /* In tree-vectorizer.c. */
  unsigned vectorize_loops (void);
+ enum vect_cost_model vectorizer_cost_model (void);

  #endif /* GCC_TREE_VECTORIZER_H */
Index: trunk/gcc/doc/invoke.texi
===================================================================
*** trunk.orig/gcc/doc/invoke.texi 2013-05-28 13:00:55.000000000 +0200
--- trunk/gcc/doc/invoke.texi 2013-05-28 14:23:21.066189174 +0200
*************** Objective-C and Objective-C++ Dialects}.
*** 419,428 ****
  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
  -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
  -ftree-switch-conversion -ftree-tail-merge @gol
! -ftree-ter -ftree-vect-loop-version -ftree-vectorize -ftree-vrp @gol
  -funit-at-a-time -funroll-all-loops -funroll-loops @gol
  -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
! -fvariable-expansion-in-unroller -fvect-cost-model -fvpt -fweb @gol
  -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
  --param @var{name}=@var{value}
  -O -O0 -O1 -O2 -O3 -Os -Ofast -Og}
--- 419,428 ----
  -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
  -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
  -ftree-switch-conversion -ftree-tail-merge @gol
! -ftree-ter -ftree-vectorize -ftree-vrp @gol
  -funit-at-a-time -funroll-all-loops -funroll-loops @gol
  -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol
! -fvariable-expansion-in-unroller -fvect-cost-model=@var{model} -fvpt -fweb @gol
  -fwhole-program -fwpa -fuse-ld=@var{linker} -fuse-linker-plugin @gol
  --param @var{name}=@var{value}
  -O -O0 -O1 -O2 -O3 -Os -Ofast -Og}
*************** Optimize yet more. @option{-O3} turns o
*** 6652,6658 ****
  by @option{-O2} and also turns on the @option{-finline-functions},
  @option{-funswitch-loops}, @option{-fpredictive-commoning},
  @option{-fgcse-after-reload}, @option{-ftree-vectorize},
- @option{-fvect-cost-model},
  @option{-ftree-partial-pre} and @option{-fipa-cp-clone} options.

  @item -O0
--- 6652,6657 ----
*************** optimizations designed to reduce code si
*** 6669,6675 ****
  @option{-Os} disables the following optimization flags:
  @gccoptlist{-falign-functions -falign-jumps -falign-loops @gol
  -falign-labels -freorder-blocks -freorder-blocks-and-partition @gol
! -fprefetch-loop-arrays -ftree-vect-loop-version}

  @item -Ofast
  @opindex Ofast
--- 6668,6674 ----
  @option{-Os} disables the following optimization flags:
  @gccoptlist{-falign-functions -falign-jumps -falign-loops @gol
  -falign-labels -freorder-blocks -freorder-blocks-and-partition @gol
! -fprefetch-loop-arrays}

  @item -Ofast
  @opindex Ofast
*************** Perform loop vectorization on trees. Thi
*** 7910,7928 ****
  Perform basic block vectorization on trees. This flag is enabled by default at
  @option{-O3} and when @option{-ftree-vectorize} is enabled.

! @item -ftree-vect-loop-version
! @opindex ftree-vect-loop-version
! Perform loop versioning when doing loop vectorization on trees. When a loop
! appears to be vectorizable except that data alignment or data dependence cannot
! be determined at compile time, then vectorized and non-vectorized versions of
! the loop are generated along with run-time checks for alignment or dependence
! to control which version is executed. This option is enabled by default
! except at level @option{-Os} where it is disabled.
!
! @item -fvect-cost-model
  @opindex fvect-cost-model
! Enable cost model for vectorization. This option is enabled by default at
! @option{-O3}.

  @item -ftree-vrp
  @opindex ftree-vrp
--- 7909,7929 ----
  Perform basic block vectorization on trees. This flag is enabled by default at
  @option{-O3} and when @option{-ftree-vectorize} is enabled.

! @item -fvect-cost-model=@var{model}
  @opindex fvect-cost-model
! Alter the cost model used for vectorization. The @var{model} argument
! should be one of @code{unlimited}, @code{dynamic} or @code{cheap}.
! With the @code{unlimited} model the vectorized code-path is assumed
! to be profitable while with the @code{dynamic} model a runtime check
! will guard the vectorized code-path to enable it only for iteration
! counts that will likely execute faster than when executing the original
! scalar loop. The @code{cheap} model will disable vectorization of
! loops where doing so would be cost prohibitive for example due to
! required runtime checks for data dependence or alignment but otherwise
! is equal to the @code{dynamic} model. The @code{cheap} model also
! disables enablement transforms that also may apply to loops that may
! not end up being vectorized.
! The default cost model is the @code{dynamic} one.

  @item -ftree-vrp
  @opindex ftree-vrp
*************** constraints. The default value is 0.
*** 9328,9340 ****

  @item vect-max-version-for-alignment-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alignment in the vectorizer. See option
! @option{-ftree-vect-loop-version} for more information.

  @item vect-max-version-for-alias-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alias in the vectorizer. See option
! @option{-ftree-vect-loop-version} for more information.

  @item max-iterations-to-track
  The maximum number of iterations of a loop the brute-force algorithm
--- 9329,9339 ----

  @item vect-max-version-for-alignment-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alignment in the vectorizer.

  @item vect-max-version-for-alias-checks
  The maximum number of run-time checks that can be performed when
! doing loop versioning for alias in the vectorizer.

  @item max-iterations-to-track
  The maximum number of iterations of a loop the brute-force algorithm
Index: trunk/gcc/Makefile.in
===================================================================
*** trunk.orig/gcc/Makefile.in 2013-05-22 12:29:31.000000000 +0200
--- trunk/gcc/Makefile.in 2013-05-28 14:23:57.959600836 +0200
*************** tree-ssa-pre.o : tree-ssa-pre.c $(TREE_F
*** 2388,2394 ****
     $(CFGLOOP_H) alloc-pool.h $(BASIC_BLOCK_H) $(BITMAP_H) $(HASH_TABLE_H) \
     $(GIMPLE_H) $(TREE_INLINE_H) tree-iterator.h tree-ssa-sccvn.h $(PARAMS_H) \
     $(DBGCNT_H) tree-scalar-evolution.h $(GIMPLE_PRETTY_PRINT_H) domwalk.h \
!    $(IPA_PROP_H)
  tree-ssa-sccvn.o : tree-ssa-sccvn.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(TREE_H) $(DIAGNOSTIC_H) \
     $(TM_H) coretypes.h $(DUMPFILE_H) $(FLAGS_H) $(CFGLOOP_H) \
--- 2388,2394 ----
     $(CFGLOOP_H) alloc-pool.h $(BASIC_BLOCK_H) $(BITMAP_H) $(HASH_TABLE_H) \
     $(GIMPLE_H) $(TREE_INLINE_H) tree-iterator.h tree-ssa-sccvn.h $(PARAMS_H) \
     $(DBGCNT_H) tree-scalar-evolution.h $(GIMPLE_PRETTY_PRINT_H) domwalk.h \
!    $(IPA_PROP_H) $(TREE_VECTORIZER_H)
  tree-ssa-sccvn.o : tree-ssa-sccvn.c $(TREE_FLOW_H) $(CONFIG_H) \
     $(SYSTEM_H) $(TREE_H) $(DIAGNOSTIC_H) \
     $(TM_H) coretypes.h $(DUMPFILE_H) $(FLAGS_H) $(CFGLOOP_H) \
*************** tree-nested.o: tree-nested.c $(CONFIG_H)
*** 2435,2441 ****
  tree-if-conv.o: tree-if-conv.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) $(FLAGS_H) $(BASIC_BLOCK_H) $(TREE_FLOW_H) \
     $(CFGLOOP_H) $(TREE_DATA_REF_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
!    $(DBGCNT_H) $(GIMPLE_PRETTY_PRINT_H)
  tree-iterator.o : tree-iterator.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) \
     coretypes.h $(GGC_H) tree-iterator.h $(GIMPLE_H) gt-tree-iterator.h
  tree-dfa.o : tree-dfa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
--- 2435,2441 ----
  tree-if-conv.o: tree-if-conv.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
     $(TREE_H) $(FLAGS_H) $(BASIC_BLOCK_H) $(TREE_FLOW_H) \
     $(CFGLOOP_H) $(TREE_DATA_REF_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \
!    $(DBGCNT_H) $(GIMPLE_PRETTY_PRINT_H) $(TREE_VECTORIZER_H)
  tree-iterator.o : tree-iterator.c $(CONFIG_H) $(SYSTEM_H) $(TREE_H) \
     coretypes.h $(GGC_H) tree-iterator.h $(GIMPLE_H) gt-tree-iterator.h
  tree-dfa.o : tree-dfa.c $(TREE_FLOW_H) $(CONFIG_H) $(SYSTEM_H) \
Index: trunk/gcc/tree-if-conv.c
===================================================================
*** trunk.orig/gcc/tree-if-conv.c 2013-05-27 13:23:43.000000000 +0200
--- trunk/gcc/tree-if-conv.c 2013-05-28 14:18:43.964098216 +0200
*************** along with GCC; see the file COPYING3.
*** 95,100 ****
--- 95,101 ----
  #include "tree-scalar-evolution.h"
  #include "tree-pass.h"
  #include "dbgcnt.h"
+ #include "tree-vectorizer.h"

  /* List of basic blocks in if-conversion-suitable order. */
  static basic_block *ifc_bbs;
*************** main_tree_if_conversion (void)
*** 1848,1856 ****
  static bool
  gate_tree_if_conversion (void)
  {
!   return ((flag_tree_vectorize && flag_tree_loop_if_convert != 0)
!           || flag_tree_loop_if_convert == 1
!           || flag_tree_loop_if_convert_stores == 1);
  }

  struct gimple_opt_pass pass_if_conversion =
--- 1849,1861 ----
  static bool
  gate_tree_if_conversion (void)
  {
!   /* If the option was explicitly specified enable the pass according
!      to that. */
!   if (global_options_set.x_flag_tree_loop_if_convert
!       || global_options_set.x_flag_tree_loop_if_convert_stores)
!     return flag_tree_loop_if_convert || flag_tree_loop_if_convert_stores;
!   return (flag_tree_vectorize != 0
!           && vectorizer_cost_model () != VECT_COST_MODEL_CHEAP);
  }

  struct gimple_opt_pass pass_if_conversion =
Index: trunk/gcc/tree-ssa-pre.c
===================================================================
*** trunk.orig/gcc/tree-ssa-pre.c 2013-04-22 13:30:25.000000000 +0200
--- trunk/gcc/tree-ssa-pre.c 2013-05-28 14:23:34.431338296 +0200
*************** along with GCC; see the file COPYING3.
*** 44,49 ****
--- 44,50 ----
  #include "dbgcnt.h"
  #include "domwalk.h"
  #include "ipa-prop.h"
+ #include "tree-vectorizer.h"

  /* TODO:

*************** inhibit_phi_insertion (basic_block bb, p
*** 3026,3032 ****
    unsigned i;

    /* If we aren't going to vectorize we don't inhibit anything. */
!   if (!flag_tree_vectorize)
      return false;

    /* Otherwise we inhibit the insertion when the address of the
--- 3027,3034 ----
    unsigned i;

    /* If we aren't going to vectorize we don't inhibit anything. */
!   if (!flag_tree_vectorize
!       || vectorizer_cost_model () == VECT_COST_MODEL_CHEAP)
      return false;

    /* Otherwise we inhibit the insertion when the address of the