A general loop multi-versioning pass can be a very powerful enabler, especially when combined with FDO. The loop MV can be done based on many different conditions: 1) loop trip count (e.g., when we do value profiling on the trip count, rather than relying on the average count); 2) memory aliasing; 3) target options -- though we do not yet support region-specific target options, the MVed loop can be outlined if that is believed to be profitable.
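As a rough source-level illustration of the aliasing-based case (the function name, bound, and element type below are invented; the vectorizer emits the equivalent guard in GIMPLE rather than C):

#include <stdint.h>

/* Illustrative sketch only, not GCC output: the loop is duplicated and
   guarded by a runtime alias check; the "fast" copy can then be
   vectorized under the no-overlap assumption.  */
void
saxpy (float *a, const float *b, int n)
{
  if ((uintptr_t) (a + n) <= (uintptr_t) b
      || (uintptr_t) (b + n) <= (uintptr_t) a)
    {
      /* Version 1: a and b cannot overlap; safe to vectorize.  */
      for (int i = 0; i < n; i++)
        a[i] += 2.0f * b[i];
    }
  else
    {
      /* Version 2: conservative fallback when the check fails.  */
      for (int i = 0; i < n; i++)
        a[i] += 2.0f * b[i];
    }
}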
David

On Wed, Oct 2, 2013 at 3:51 PM, Cong Hou <co...@google.com> wrote:
> On Wed, Oct 2, 2013 at 2:18 PM, Xinliang David Li <davi...@google.com> wrote:
>> On Wed, Oct 2, 2013 at 4:24 AM, Richard Biener <rguent...@suse.de> wrote:
>>> On Tue, 1 Oct 2013, Cong Hou wrote:
>>>
>>>> When alias exists between data refs in a loop, to vectorize it GCC
>>>> does loop versioning and adds runtime alias checks. Basically for each
>>>> pair of data refs with possible data dependence, there will be two
>>>> comparisons generated to make sure there is no aliasing between them
>>>> in each iteration of the vectorized loop. If there are many such data
>>>> refs pairs, the number of comparisons can be very large, which is a
>>>> big overhead.
>>>>
>>>> However, in some cases it is possible to reduce the number of those
>>>> comparisons. For example, for the following loop, we can detect that
>>>> b[0] and b[1] are two consecutive member accesses so that we can
>>>> combine the alias check between a[0:100]&b[0] and a[0:100]&b[1] into
>>>> checking a[0:100]&b[0:2]:
>>>>
>>>> void foo(int*a, int* b)
>>>> {
>>>>   for (int i = 0; i < 100; ++i)
>>>>     a[i] = b[0] + b[1];
>>>> }
>>>>
>>>> Actually, the requirement of consecutive memory accesses is too
>>>> strict. For the following loop, we can still combine the alias checks
>>>> between a[0:100]&b[0] and a[0:100]&b[100]:
>>>>
>>>> void foo(int*a, int* b)
>>>> {
>>>>   for (int i = 0; i < 100; ++i)
>>>>     a[i] = b[0] + b[100];
>>>> }
>>>>
>>>> This is because if b[0] is not in a[0:100] and b[100] is not in
>>>> a[0:100] then a[0:100] cannot be between b[0] and b[100]. We only need
>>>> to check a[0:100] and b[0:101] don't overlap.
>>>>
>>>> More generally, consider two pairs of data refs (a, b1) and (a, b2).
>>>> Suppose addr_b1 and addr_b2 are basic addresses of data ref b1 and b2;
>>>> offset_b1 and offset_b2 (offset_b1 < offset_b2) are offsets of b1 and
>>>> b2, and segment_length_a, segment_length_b1, and segment_length_b2 are
>>>> segment length of a, b1, and b2. Then we can combine the two
>>>> comparisons into one if the following condition is satisfied:
>>>>
>>>> offset_b2 - offset_b1 - segment_length_b1 < segment_length_a
>>>>
>>>>
>>>> This patch detects those combination opportunities to reduce the
>>>> number of alias checks. It is tested on an x86-64 machine.
>>>
>>> Apart from the other comments you got (to which I agree) the patch
>>> seems to do two things, namely also:
>>>
>>> +  /* Extract load and store statements on pointers with zero-stride
>>> +     accesses.  */
>>> +  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
>>> +    {
>>>
>>> which I'd rather see in a separate patch (and done also when
>>> the loop doesn't require versioning for alias).
>>
>> yes.
>>
>>>
>>> Also combining the alias checks in vect_create_cond_for_alias_checks
>>> is nice but doesn't properly fix the use of the
>>> vect-max-version-for-alias-checks param
>>
>> Yes. The handling of this should be moved to
>> 'vect_prune_runtime_alias_test_list' to avoid premature decisions.
>>
>>
>>
>>> which currently inhibits
>>> vectorization of the HIMENO benchmark by default (and make us look bad
>>> compared to LLVM).
>>
>> Here is a small reproducible:
>>
>> struct A {
>>   int *base;
>>   int offset;
>>   int offset2;
>>   int offset3;
>>   int offset4;
>>   int offset5;
>>   int offset6;
>>   int offset7;
>>   int offset8;
>> };
>>
>> void foo (struct A * ar1, struct A* ar2)
>> {
>>   int i;
>>   for (i = 0; i < 10000; i++)
>>     {
>>       ar1->base[i] = 2*ar2->base[i] + ar2->offset + ar2->offset2
>>           + ar2->offset3 + ar2->offset4 + ar2->offset5 + ar2->offset6; /* +
>>           ar2->offset7 + ar2->offset8;*/
>>     }
>> }
>>
>> GCC trunk won't vectorize it at O2 due to the limit.
>>
>>
>> There is another problem we should be tracking: GCC no longer
>> vectorize the loop (with large
>> --param=vect-max-version-for-alias-checks=40) when -fno-strict-alias
>> is specified. However with additional runtime alias check, the loop
>> should be vectorizable.
>
>
> The problem can be reproduced by the following loop:
>
>
> void foo (int* a, int** b)
> {
>   int i;
>   for (i = 0; i < 1000; ++i)
>     a[i] = (*b)[i];
> }
>
>
> When -fno-strict-aliasing is specified, the basic address of (*b)[i]
> which is *b could be modified by a[i] if alias exists between them.
> This forbids GCC from making the basic address of (*b)[i] a loop
> invariant, and hence could not do vectorization. Although we can still
> check the aliasing between a[i] and *b from a and b (b < a or b >=
> a+1000), it must be done before vectorization and after this we have
> to do one more nested aliasing check (a and *b now):
>
>
> void foo (int* a, int** b)
> {
>   int i;
>   if ((int*)b < a || (int*)b >= a+1000)
>     {
>       int* c = *b;
>       for (i = 0; i < 1000; ++i)
>         a[i] = c[i];
>     }
>   else
>     {
>       for (i = 0; i < 1000; ++i)
>         a[i] = (*b)[i];
>     }
> }
>
> I think this is too complicated to be able to get any benefit for us.
>
>
> thanks,
> Cong
>
>
>
>>
>> David
>>
>>
>>>
>>> So I believe this merging should be done incrementally when
>>> we collect the DDRs we need to test in vect_mark_for_runtime_alias_test.
>>>
>>> Thanks for working on this,
>>> Richard.
>>>
>>>>
>>>> thanks,
>>>> Cong
>>>>
>>>>
>>>> Index: gcc/tree-vect-loop-manip.c
>>>> ===================================================================
>>>> --- gcc/tree-vect-loop-manip.c (revision 202662)
>>>> +++ gcc/tree-vect-loop-manip.c (working copy)
>>>> @@ -19,6 +19,10 @@ You should have received a copy of the G
>>>>  along with GCC; see the file COPYING3.  If not see
>>>>  <http://www.gnu.org/licenses/>.  */
>>>>
>>>> +#include <vector>
>>>> +#include <utility>
>>>> +#include <algorithm>
>>>> +
>>>>  #include "config.h"
>>>>  #include "system.h"
>>>>  #include "coretypes.h"
>>>> @@ -2248,6 +2252,74 @@ vect_vfa_segment_size (struct data_refer
>>>>    return segment_length;
>>>>  }
>>>>
>>>> +namespace
>>>> +{
>>>> +
>>>> +/* struct dr_addr_with_seg_len
>>>> +
>>>> +   A struct storing information of a data reference, including the data
>>>> +   ref itself, its basic address, the access offset and the segment length
>>>> +   for aliasing checks.  */
>>>> +
>>>> +struct dr_addr_with_seg_len
>>>> +{
>>>> +  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
>>>> +    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
>>>> +
>>>> +  data_reference* dr;
>>>> +  tree basic_addr;
>>>> +  tree offset;
>>>> +  tree seg_len;
>>>> +};
>>>> +
>>>> +/* Operator == between two dr_addr_with_seg_len objects.
>>>> +
>>>> +   This equality operator is used to make sure two data refs
>>>> +   are the same one so that we will consider to combine the
>>>> +   aliasing checks of those two pairs of data dependent data
>>>> +   refs.  */
>>>> +
>>>> +bool operator == (const dr_addr_with_seg_len& d1,
>>>> +                  const dr_addr_with_seg_len& d2)
>>>> +{
>>>> +  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
>>>> +         && operand_equal_p (d1.offset, d2.offset, 0)
>>>> +         && operand_equal_p (d1.seg_len, d2.seg_len, 0);
>>>> +}
>>>> +
>>>> +typedef std::pair <dr_addr_with_seg_len, dr_addr_with_seg_len>
>>>> +  dr_addr_with_seg_len_pair_t;
>>>> +
>>>> +
>>>> +/* Operator < between two dr_addr_with_seg_len_pair_t objects.
>>>> +
>>>> +   This operator is used to sort objects of dr_addr_with_seg_len_pair_t
>>>> +   so that we can combine aliasing checks during one scan.  */
>>>> +
>>>> +bool operator < (const dr_addr_with_seg_len_pair_t& p1,
>>>> +                 const dr_addr_with_seg_len_pair_t& p2)
>>>> +{
>>>> +  const dr_addr_with_seg_len& p11 = p1.first;
>>>> +  const dr_addr_with_seg_len& p12 = p1.second;
>>>> +  const dr_addr_with_seg_len& p21 = p2.first;
>>>> +  const dr_addr_with_seg_len& p22 = p2.second;
>>>> +
>>>> +  if (p11.basic_addr != p21.basic_addr)
>>>> +    return p11.basic_addr < p21.basic_addr;
>>>> +  if (p12.basic_addr != p22.basic_addr)
>>>> +    return p12.basic_addr < p22.basic_addr;
>>>> +  if (TREE_CODE (p11.offset) != INTEGER_CST
>>>> +      || TREE_CODE (p21.offset) != INTEGER_CST)
>>>> +    return p11.offset < p21.offset;
>>>> +  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
>>>> +    return int_cst_value (p11.offset) < int_cst_value (p21.offset);
>>>> +  if (TREE_CODE (p12.offset) != INTEGER_CST
>>>> +      || TREE_CODE (p22.offset) != INTEGER_CST)
>>>> +    return p12.offset < p22.offset;
>>>> +  return int_cst_value (p12.offset) < int_cst_value (p22.offset);
>>>> +}
>>>> +
>>>> +}
>>>>
>>>>  /* Function vect_create_cond_for_alias_checks.
>>>>
>>>> @@ -2292,20 +2364,51 @@ vect_create_cond_for_alias_checks (loop_
>>>>    if (may_alias_ddrs.is_empty ())
>>>>      return;
>>>>
>>>> +
>>>> +  /* Basically, for each pair of dependent data refs store_ptr_0
>>>> +     and load_ptr_0, we create an expression:
>>>> +
>>>> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
>>>> +     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
>>>> +
>>>> +     for aliasing checks. However, in some cases we can decrease
>>>> +     the number of checks by combining two checks into one. For
>>>> +     example, suppose we have another pair of data refs store_ptr_0
>>>> +     and load_ptr_1, and if the following condition is satisfied:
>>>> +
>>>> +     load_ptr_0 < load_ptr_1 &&
>>>> +     load_ptr_1 - load_ptr_0 - load_segment_length_0 < store_segment_length_0
>>>> +
>>>> +     (this condition means, in each iteration of vectorized loop,
>>>> +     the accessed memory of store_ptr_0 cannot be between the memory
>>>> +     of load_ptr_0 and load_ptr_1.)
>>>> +
>>>> +     we then can use only the following expression to finish the
>>>> +     alising checks between store_ptr_0 & load_ptr_0 and
>>>> +     store_ptr_0 & load_ptr_1:
>>>> +
>>>> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
>>>> +     || (load_ptr_1 + load_segment_length_1 <= store_ptr_0))
>>>> +
>>>> +     Note that we only consider that load_ptr_0 and load_ptr_1 have the
>>>> +     same basic address.  */
>>>> +
>>>> +  std::vector<dr_addr_with_seg_len_pair_t> ddrs_with_seg_len;
>>>> +
>>>> +  /* First, we collect all data ref pairs for aliasing checks.  */
>>>> +
>>>>    FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>>>>      {
>>>>        struct data_reference *dr_a, *dr_b;
>>>>        gimple dr_group_first_a, dr_group_first_b;
>>>> -      tree addr_base_a, addr_base_b;
>>>>        tree segment_length_a, segment_length_b;
>>>>        gimple stmt_a, stmt_b;
>>>> -      tree seg_a_min, seg_a_max, seg_b_min, seg_b_max;
>>>>
>>>>        dr_a = DDR_A (ddr);
>>>>        stmt_a = DR_STMT (DDR_A (ddr));
>>>>        dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
>>>>        if (dr_group_first_a)
>>>> -        {
>>>> +        {
>>>>            stmt_a = dr_group_first_a;
>>>>            dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
>>>>          }
>>>> @@ -2314,20 +2417,11 @@ vect_create_cond_for_alias_checks (loop_
>>>>        stmt_b = DR_STMT (DDR_B (ddr));
>>>>        dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
>>>>        if (dr_group_first_b)
>>>> -        {
>>>> +        {
>>>>            stmt_b = dr_group_first_b;
>>>>            dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
>>>>          }
>>>>
>>>> -      addr_base_a
>>>> -        = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_a),
>>>> -                                   size_binop (PLUS_EXPR, DR_OFFSET (dr_a),
>>>> -                                               DR_INIT (dr_a)));
>>>> -      addr_base_b
>>>> -        = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_b),
>>>> -                                   size_binop (PLUS_EXPR, DR_OFFSET (dr_b),
>>>> -                                               DR_INIT (dr_b)));
>>>> -
>>>>        if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
>>>>          length_factor = scalar_loop_iters;
>>>>        else
>>>> @@ -2335,24 +2429,149 @@ vect_create_cond_for_alias_checks (loop_
>>>>        segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
>>>>        segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
>>>>
>>>> +      dr_addr_with_seg_len_pair_t dr_with_seg_len_pair
>>>> +        (dr_addr_with_seg_len
>>>> +           (dr_a, DR_BASE_ADDRESS (dr_a),
>>>> +            size_binop (PLUS_EXPR, DR_OFFSET (dr_a), DR_INIT (dr_a)),
>>>> +            segment_length_a),
>>>> +         dr_addr_with_seg_len
>>>> +           (dr_b, DR_BASE_ADDRESS (dr_b),
>>>> +            size_binop (PLUS_EXPR, DR_OFFSET (dr_b), DR_INIT (dr_b)),
>>>> +            segment_length_b));
>>>> +
>>>> +      if (dr_with_seg_len_pair.first.basic_addr >
>>>> +          dr_with_seg_len_pair.second.basic_addr)
>>>> +        std::swap (dr_with_seg_len_pair.first, dr_with_seg_len_pair.second);
>>>> +
>>>> +      ddrs_with_seg_len.push_back (dr_with_seg_len_pair);
>>>> +    }
>>>> +
>>>> +  /* Second, we sort the collected data ref pairs so that we can scan
>>>> +     them once to combine all possible aliasing checks.  */
>>>> +
>>>> +  std::sort (ddrs_with_seg_len.begin(), ddrs_with_seg_len.end());
>>>> +
>>>> +  /* Remove duplicate data ref pairs.  */
>>>> +  ddrs_with_seg_len.erase (std::unique (ddrs_with_seg_len.begin(),
>>>> +                                        ddrs_with_seg_len.end()),
>>>> +                           ddrs_with_seg_len.end());
>>>> +
>>>> +  /* We then scan the sorted dr pairs and check if we can combine
>>>> +     alias checks of two neighbouring dr pairs.  */
>>>> +
>>>> +  for (size_t i = 1; i < ddrs_with_seg_len.size (); ++i)
>>>> +    {
>>>> +      dr_addr_with_seg_len& dr_a1 = ddrs_with_seg_len[i-1].first;
>>>> +      dr_addr_with_seg_len& dr_b1 = ddrs_with_seg_len[i-1].second;
>>>> +      dr_addr_with_seg_len& dr_a2 = ddrs_with_seg_len[i].first;
>>>> +      dr_addr_with_seg_len& dr_b2 = ddrs_with_seg_len[i].second;
>>>> +
>>>> +      if (dr_a1 == dr_a2)
>>>> +        {
>>>> +          if (dr_b1.basic_addr != dr_b2.basic_addr
>>>> +              || TREE_CODE (dr_b1.offset) != INTEGER_CST
>>>> +              || TREE_CODE (dr_b2.offset) != INTEGER_CST)
>>>> +            continue;
>>>> +
>>>> +          int diff = int_cst_value (dr_b2.offset) -
>>>> +                     int_cst_value (dr_b1.offset);
>>>> +
>>>> +          gcc_assert (diff > 0);
>>>> +
>>>> +          if (diff <= vect_factor
>>>> +              || (TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>>> +                  && diff - int_cst_value (dr_b1.seg_len) < vect_factor)
>>>> +              || (TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>>> +                  && TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>>> +                  && diff - int_cst_value (dr_b1.seg_len) <
>>>> +                     int_cst_value (dr_a1.seg_len)))
>>>> +            {
>>>> +              if (dump_enabled_p ())
>>>> +                {
>>>> +                  dump_printf_loc
>>>> +                    (MSG_NOTE, vect_location,
>>>> +                     "combining two runtime checks for data references ");
>>>> +                  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b1.dr));
>>>> +                  dump_printf (MSG_NOTE, " and ");
>>>> +                  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b2.dr));
>>>> +                  dump_printf (MSG_NOTE, "\n");
>>>> +                }
>>>> +
>>>> +              dr_b1.seg_len = size_binop (PLUS_EXPR,
>>>> +                                          dr_b2.seg_len, size_int (diff));
>>>> +              ddrs_with_seg_len.erase (ddrs_with_seg_len.begin () + i);
>>>> +              --i;
>>>> +            }
>>>> +        }
>>>> +      else if (dr_b1 == dr_b2)
>>>> +        {
>>>> +          if (dr_a1.basic_addr != dr_a2.basic_addr
>>>> +              || TREE_CODE (dr_a1.offset) != INTEGER_CST
>>>> +              || TREE_CODE (dr_a2.offset) != INTEGER_CST)
>>>> +            continue;
>>>> +
>>>> +          int diff = int_cst_value (dr_a2.offset) -
>>>> +                     int_cst_value (dr_a1.offset);
>>>> +
>>>> +          gcc_assert (diff > 0);
>>>> +
>>>> +          if (diff <= vect_factor
>>>> +              || (TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>>> +                  && diff - int_cst_value (dr_a1.seg_len) < vect_factor)
>>>> +              || (TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>>> +                  && TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>>> +                  && diff - int_cst_value (dr_a1.seg_len) <
>>>> +                     int_cst_value (dr_b1.seg_len)))
>>>> +            {
>>>> +              if (dump_enabled_p ())
>>>> +                {
>>>> +                  dump_printf_loc
>>>> +                    (MSG_NOTE, vect_location,
>>>> +                     "combining two runtime checks for data references ");
>>>> +                  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a1.dr));
>>>> +                  dump_printf (MSG_NOTE, " and ");
>>>> +                  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a2.dr));
>>>> +                  dump_printf (MSG_NOTE, "\n");
>>>> +                }
>>>> +
>>>> +              dr_a1.seg_len = size_binop (PLUS_EXPR,
>>>> +                                          dr_a2.seg_len, size_int (diff));
>>>> +              ddrs_with_seg_len.erase (ddrs_with_seg_len.begin () + i);
>>>> +              --i;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +  for (size_t i = 0, s = ddrs_with_seg_len.size (); i < s; ++i)
>>>> +    {
>>>> +      const dr_addr_with_seg_len& dr_a = ddrs_with_seg_len[i].first;
>>>> +      const dr_addr_with_seg_len& dr_b = ddrs_with_seg_len[i].second;
>>>> +      tree segment_length_a = dr_a.seg_len;
>>>> +      tree segment_length_b = dr_b.seg_len;
>>>> +
>>>> +      tree addr_base_a
>>>> +        = fold_build_pointer_plus (dr_a.basic_addr, dr_a.offset);
>>>> +      tree addr_base_b
>>>> +        = fold_build_pointer_plus (dr_b.basic_addr, dr_b.offset);
>>>> +
>>>>        if (dump_enabled_p ())
>>>>          {
>>>>            dump_printf_loc (MSG_NOTE, vect_location,
>>>> -                           "create runtime check for data references ");
>>>> -          dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a));
>>>> +                           "create runtime check for data references ");
>>>> +          dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a.dr));
>>>>            dump_printf (MSG_NOTE, " and ");
>>>> -          dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b));
>>>> -          dump_printf (MSG_NOTE, "\n");
>>>> +          dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b.dr));
>>>> +          dump_printf (MSG_NOTE, "\n");
>>>>          }
>>>>
>>>> -      seg_a_min = addr_base_a;
>>>> -      seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
>>>> -      if (tree_int_cst_compare (DR_STEP (dr_a), size_zero_node) < 0)
>>>> +      tree seg_a_min = addr_base_a;
>>>> +      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
>>>> +      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
>>>>          seg_a_min = seg_a_max, seg_a_max = addr_base_a;
>>>>
>>>> -      seg_b_min = addr_base_b;
>>>> -      seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
>>>> -      if (tree_int_cst_compare (DR_STEP (dr_b), size_zero_node) < 0)
>>>> +      tree seg_b_min = addr_base_b;
>>>> +      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
>>>> +      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
>>>>          seg_b_min = seg_b_max, seg_b_max = addr_base_b;
>>>>
>>>>        part_cond_expr =
>>>> @@ -2477,6 +2696,81 @@ vect_loop_versioning (loop_vec_info loop
>>>>        adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi));
>>>>      }
>>>>
>>>> +  /* Extract load and store statements on pointers with zero-stride
>>>> +     accesses.  */
>>>> +  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
>>>> +    {
>>>> +
>>>> +      /* In the loop body, we iterate each statement to check if it is a load
>>>> +         or store. Then we check the DR_STEP of the data reference. If
>>>> +         DR_STEP is zero, then we will hoist the load statement to the loop
>>>> +         preheader, and move the store statement to the loop exit.  */
>>>> +
>>>> +      for (gimple_stmt_iterator si = gsi_start_bb (loop->header);
>>>> +           !gsi_end_p (si); )
>>>> +        {
>>>> +          gimple stmt = gsi_stmt (si);
>>>> +          stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>>>> +          struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
>>>> +
>>>> +
>>>> +          if (dr && integer_zerop (DR_STEP (dr)))
>>>> +            {
>>>> +              if (DR_IS_READ (dr))
>>>> +                {
>>>> +                  if (dump_file)
>>>> +                    {
>>>> +                      fprintf (dump_file,
>>>> +                               "Hoist the load to outside of the loop:\n");
>>>> +                      print_gimple_stmt (dump_file, stmt, 0,
>>>> +                                         TDF_VOPS|TDF_MEMSYMS);
>>>> +                    }
>>>> +
>>>> +                  basic_block preheader = loop_preheader_edge (loop)->src;
>>>> +                  gimple_stmt_iterator si_dst = gsi_last_bb (preheader);
>>>> +                  gsi_move_after (&si, &si_dst);
>>>> +                }
>>>> +              else
>>>> +                {
>>>> +                  gimple_stmt_iterator si_dst =
>>>> +                    gsi_last_bb (single_exit (loop)->dest);
>>>> +                  gsi_move_after (&si, &si_dst);
>>>> +                }
>>>> +              continue;
>>>> +            }
>>>> +          else if (!dr)
>>>> +            {
>>>> +              bool hoist = true;
>>>> +              for (size_t i = 0; i < gimple_num_ops (stmt); i++)
>>>> +                {
>>>> +                  tree op = gimple_op (stmt, i);
>>>> +                  if (TREE_CODE (op) == INTEGER_CST
>>>> +                      || TREE_CODE (op) == REAL_CST)
>>>> +                    continue;
>>>> +                  if (TREE_CODE (op) == SSA_NAME)
>>>> +                    {
>>>> +                      gimple def = SSA_NAME_DEF_STMT (op);
>>>> +                      if (def == stmt
>>>> +                          || gimple_nop_p (def)
>>>> +                          || !flow_bb_inside_loop_p (loop, gimple_bb (def)))
>>>> +                        continue;
>>>> +                    }
>>>> +                  hoist = false;
>>>> +                  break;
>>>> +                }
>>>> +
>>>> +              if (hoist)
>>>> +                {
>>>> +                  basic_block preheader = loop_preheader_edge (loop)->src;
>>>> +                  gimple_stmt_iterator si_dst = gsi_last_bb (preheader);
>>>> +                  gsi_move_after (&si, &si_dst);
>>>> +                  continue;
>>>> +                }
>>>> +            }
>>>> +          gsi_next (&si);
>>>> +        }
>>>> +    }
>>>> +
>>>>    /* End loop-exit-fixes after versioning.  */
>>>>
>>>>    if (cond_expr_stmt_list)
>>>> Index: gcc/ChangeLog
>>>> ===================================================================
>>>> --- gcc/ChangeLog (revision 202663)
>>>> +++ gcc/ChangeLog (working copy)
>>>> @@ -1,3 +1,8 @@
>>>> +2013-10-01  Cong Hou  <co...@google.com>
>>>> +
>>>> +	* tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): Combine
>>>> +	alias checks if it is possible to amortize the runtime overhead.
>>>> +
>>>>
>>>>
>>>
>>> --
>>> Richard Biener <rguent...@suse.de>
>>> SUSE / SUSE Labs
>>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>>> GF: Jeff Hawn, Jennifer Guild, Felix Imend
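To make the merging condition discussed above concrete (offset_b2 - offset_b1 - segment_length_b1 < segment_length_a), here is a minimal standalone sketch of the same arithmetic on plain integers. It is not GCC code; the struct, function name, and driver values are invented for illustration and correspond to the b[0]/b[100] example from the original mail.

#include <stdio.h>

/* One alias-check segment: bytes [offset, offset + seg_len) from a base.  */
struct segment
{
  long offset;
  long seg_len;
};

/* Try to merge two checks against the same opposing segment of length
   other_seg_len.  b1 and b2 share a base address and are sorted so that
   b1->offset <= b2->offset.  Returns 1 and widens *b1 when one combined
   check is as strong as the two separate ones, mirroring the condition
   offset_b2 - offset_b1 - segment_length_b1 < segment_length_a.  */
static int
merge_checks (struct segment *b1, const struct segment *b2, long other_seg_len)
{
  long diff = b2->offset - b1->offset;
  if (diff - b1->seg_len < other_seg_len)
    {
      /* The opposing segment cannot fit in the gap between b1 and b2, so
         checking the union [b1->offset, b2->offset + b2->seg_len) suffices.  */
      b1->seg_len = diff + b2->seg_len;
      return 1;
    }
  return 0;
}

int
main (void)
{
  /* b[0] and b[100] as 4-byte ints checked against a 400-byte segment of a:
     the two checks collapse into one covering b[0:101].  */
  struct segment b1 = { 0, 4 }, b2 = { 400, 4 };
  if (merge_checks (&b1, &b2, 400))
    printf ("merged check: offset %ld, length %ld\n", b1.offset, b1.seg_len);
  return 0;
}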
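The second part of the patch -- the zero-stride extraction Richard asks to split out -- hoists loop-invariant loads to the loop preheader and sinks the corresponding stores to the loop exit. A rough source-level before/after view of the intended effect, with invented names, assuming the runtime alias check in the versioned loop has already ruled out overlap:

/* Before: *b has zero stride but is re-read every iteration, and *sum is
   re-stored every iteration.  Illustration only, not GCC output.  */
void before (const int *a, const int *b, int *sum, int n)
{
  for (int i = 0; i < n; i++)
    *sum += a[i] * *b;
}

/* After: the zero-stride load is hoisted to the preheader and the
   zero-stride store is sunk past the loop.  */
void after (const int *a, const int *b, int *sum, int n)
{
  int b_val = *b;      /* hoisted load */
  int acc = *sum;
  for (int i = 0; i < n; i++)
    acc += a[i] * b_val;
  *sum = acc;          /* sunk store */
}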